Timings for PFAC_matchFromDeviceReduce vs PFAC_matchFromHostReduce #1

@GoogleCodeExporter

What version of the product are you using? On what operating system?
PFAC 1.0, on RHEL 6

Please provide any additional information below.

I measured the time taken by PFAC_matchFromHostReduce and by the equivalent 
sequence of steps built around PFAC_matchFromDeviceReduce. For a 100 MB input 
string, both approaches take about the same total time.

Timing for PFAC_matchFromHostReduce: 56 ms

Timing for equivalent steps using PFAC_matchFromDeviceReduce:
cudaMalloc: 0.3 ms
cudaMemcpy(d_input_string, h_input_string, input_size, cudaMemcpyHostToDevice): 18 ms
PFAC_matchFromDeviceReduce: 26 ms
cudaMemcpy of d_pos and d_match_result back to CPU: 0.3 ms
cudaFree of d_input_string, d_pos and d_match_result: 11 ms
Total: 57 ms

Original issue reported on code.google.com by hja...@ymail.com on 29 Apr 2011 at 1:49
