feat: pxf fdw support parallel scan #61
    // Parallel mode: only process the specified fragment
    Fragment specificFragment = fragmenterService.getFragmentByIndex(
            context, context.getSpecificFragmentIndex());
    fragments = java.util.Collections.singletonList(specificFragment);
NIT: import `java.util.Collections` instead of using the fully qualified name?
fdw/pxf_bridge.h
    slock_t mutex;          /* mutex for accessing shared state */
    int     total_fragments; /* total number of fragments */
    int     next_fragment;   /* next fragment index to be assigned */
    bool    finished;        /* true if all fragments have been processed */
True if all fragments have been processed... or cancelled?
Also, what is the purpose of a write-only variable?
@ostinru My approach to kernel parallel processing is still too simplistic. I may change or refactor it later.
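For reference, here is a minimal sketch of how a worker might claim fragments from shared state with the fields quoted above (`mutex`, `total_fragments`, `next_fragment`). It is not the code in this PR. `SketchParallelState`, `sketch_state_init`, `sketch_claim_fragment`, and `sketch_cancel` are hypothetical names, `pthread_mutex_t` stands in for `slock_t` so the sketch builds outside PostgreSQL, and the `cancelled` flag is an assumed addition.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared state in fdw/pxf_bridge.h.
 * pthread_mutex_t replaces slock_t so this builds without PostgreSQL. */
typedef struct SketchParallelState
{
    pthread_mutex_t mutex;           /* protects the fields below */
    int             total_fragments; /* total number of fragments */
    int             next_fragment;   /* next fragment index to hand out */
    bool            cancelled;       /* assumed flag: scan was aborted */
} SketchParallelState;

static void
sketch_state_init(SketchParallelState *state, int total_fragments)
{
    assert(state != NULL && total_fragments >= 0);
    pthread_mutex_init(&state->mutex, NULL);
    state->total_fragments = total_fragments;
    state->next_fragment = 0;
    state->cancelled = false;
}

/* Returns the claimed fragment index, or -1 if none is left or the scan
 * was cancelled. Each index is handed out to exactly one caller. */
static int
sketch_claim_fragment(SketchParallelState *state)
{
    int claimed = -1;

    assert(state != NULL);
    pthread_mutex_lock(&state->mutex);
    if (!state->cancelled && state->next_fragment < state->total_fragments)
        claimed = state->next_fragment++;
    pthread_mutex_unlock(&state->mutex);
    return claimed;
}

/* Marks the scan as cancelled so later claims return -1. */
static void
sketch_cancel(SketchParallelState *state)
{
    assert(state != NULL);
    pthread_mutex_lock(&state->mutex);
    state->cancelled = true;
    pthread_mutex_unlock(&state->mutex);
}
```

In this sketch a separate `finished` flag is unnecessary, because exhaustion can be derived from `next_fragment >= total_fragments`. A flag only earns its place if some reader checks it, as `cancelled` is checked in `sketch_claim_fragment`.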
All deployments are local.
The good news from exploring parallelization is that it does improve efficiency. For small data volumes the improvement is not obvious, and a parallel scan may even be slower than a non-parallel one. Only with large data volumes is there a noticeable improvement. However, the improvement still falls short of expectations: in theory, the speedup should be nearly equal to the number of workers. The shortfall may be due to I/O or CPU bottlenecks. I will investigate further.
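One standard way to reason about this gap is Amdahl's law. It was not measured in this PR, and the numbers below are illustrative assumptions only. If a fraction $p$ of the scan time can be parallelized across $n$ workers and the remainder (planning, fragment lookup, result gathering) stays serial, the best possible speedup is:

```latex
S(n) = \frac{1}{(1 - p) + \dfrac{p}{n}}
% Illustrative example, assuming p = 0.9 and n = 4:
% S(4) = \frac{1}{0.1 + 0.225} = \frac{1}{0.325} \approx 3.08
```

Under these assumed values, four workers give roughly a 3x speedup rather than 4x. A speedup equal to the number of workers is reached only when $p$ is close to 1 and no shared resource such as disk or network I/O saturates first.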
- Added PxfBridgeImportStartVirtual function to manage imports with virtual segment IDs.
- Updated PxfFdwScanState structure to include fields for gang-parallel execution.
- Enhanced foreign scan functions to support gang-parallel mode, ensuring unique fragment distribution among workers.
- Implemented initialization and cleanup routines for gang-parallel state management.
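To illustrate what "unique fragment distribution among workers" can mean, here is a minimal sketch of one possible rule: round-robin assignment by virtual segment ID. This is an assumption for illustration, not necessarily the rule this commit implements. `sketch_fragment_is_mine` and `sketch_count_owned` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed rule: fragment i belongs to virtual segment (i % count).
 * Every fragment index then maps to exactly one virtual segment. */
static bool
sketch_fragment_is_mine(int fragment_index,
                        int virtual_segment_id,
                        int virtual_segment_count)
{
    assert(fragment_index >= 0);
    assert(virtual_segment_count > 0);
    assert(virtual_segment_id >= 0 &&
           virtual_segment_id < virtual_segment_count);
    return fragment_index % virtual_segment_count == virtual_segment_id;
}

/* Counts how many of total_fragments a given virtual segment would scan. */
static int
sketch_count_owned(int total_fragments,
                   int virtual_segment_id,
                   int virtual_segment_count)
{
    int owned = 0;

    for (int i = 0; i < total_fragments; i++)
    {
        if (sketch_fragment_is_mine(i, virtual_segment_id,
                                    virtual_segment_count))
            owned++;
    }
    return owned;
}
```

A static rule like this needs no shared state or locking, but it cannot rebalance when fragments differ in size. Handing out fragments from a shared counter trades a small amount of locking for better balance.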
#58
Change logs
This PR adds parallel scan support to the PXF FDW. The implementation depends on a corresponding kernel commit.
The code is not yet ready for review. This commit is only an exploratory submission for FDW parallelization. More importantly, I first need to make sure the core kernel part is solid.
apache/cloudberry#1571