Interaction between peach and other optimisations


Posted by erichards on November 17, 2023 at 12:00 am
I understand there are various parallel optimisations that happen under the hood when running with some number of secondary threads, e.g. summing across multiple partitions. How do these interact with peach?

For example:

disk0/hdb/par.txt –> disk1/hdb/partitions, disk2/hdb/partitions

disk1/hdb/partitions/1-3-5

disk2/hdb/partitions/2-4-6

If I ran a query such as
```
/ t is a placeholder name for the partitioned table
select sum price by sym from t where int within (1;4)
```
and I had two secondary threads available, thread #1 would retrieve data from partitions 1, 3 on disk 1, and thread #2 would retrieve data from partitions 2, 4 on disk 2 to maximise I/O throughput.

But if my queries were wrapped in peach, would this still be possible, given peach would be using all available threads, e.g.
```
/ t is a placeholder name for the partitioned table
{x[]} peach ({select sum price by sym from t where int within (1;4)};{select sum price by sym from t where int within (5;6)})
```
So are there situations when using peach can reduce performance? Thank you

rocuinneagain

Member
November 17, 2023 at 12:00 am
The parallelism can only go one layer deep.

i.e. these two statements end up executing the same path. In the first one, the inner `peach` can only run like an `each`, because it is already inside a thread:
```
data:8#enlist til 1000000
\ts {{neg x} peach x} peach data
553 1968
\ts {{neg x} each x} peach data
551 1936
```
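A minimal sketch for checking this on your own process, assuming a session started with secondary threads (e.g. `q -s 4`). The small vector `d` and the names `r1`, `r2`, `r3` are made up for illustration. It only confirms that the three forms return identical results, so the difference seen in the timings above is purely in how the work is executed:

```q
/ number of secondary threads available to this process
system"s"
/ small stand-in for the data above (hypothetical)
d:8#enlist til 1000
/ nested peach, each inside peach, and plain each
r1:{{neg x} peach x} peach d
r2:{{neg x} each x} peach d
r3:{{neg x} each x} each d
/ all three results match
(r1~r2)&r2~r3
```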
For queries, map-reduce is still used when they run inside a `peach`. It reduces the memory load of your nested queries even though the sub-parts are not run in parallel.

https://code.kx.com/q4m3/14_Introduction_to_Kdb%2B/#1437-map-reduce
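As a rough illustration of the map-reduce idea (not of kdb+ internals), an aggregate can be computed per partition and the partial results then combined. The list `parts` below is made-up toy data standing in for per-partition price vectors, not a real partitioned table:

```q
/ toy stand-ins for the price column of three partitions (hypothetical data)
parts:(1 2 3f;4 5f;6 7 8 9f)
/ map: one partial sum per partition; reduce: sum of the partial sums
total:sum sum each parts
/ same answer as summing all the data in one go
total~sum raze parts
/ an average needs two partials per partition: a sum and a count
mean:(sum sum each parts)%sum count each parts
mean~avg raze parts
```

Only the small partial results need to be held together at the end, which is where the memory saving comes from.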

Where you choose to put your `peach` can be important and change the performance of your execution.

My example actually runs faster without `peach`, because the overhead of passing data between threads outweighs the cost of `neg`, which is a very simple operation:
```
\ts {{neg x} each x} each data
348 91498576
```
`.Q.fc` exists to help in these cases:
```
\ts {.Q.fc[{neg x};x]} each data
19 67110432
```
https://code.kx.com/q/ref/dotq/#fc-parallel-on-cut
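A small sketch of what `.Q.fc` does, using a made-up vector `v`: it cuts the vector into chunks, applies the function to each chunk (in parallel when secondary threads are available) and joins the results, so the output should match applying the function to the whole vector directly:

```q
/ a vector long enough to be worth cutting into chunks (hypothetical data)
v:til 1000000
/ parallel on cut: apply the function to each chunk and join the results
r:.Q.fc[{neg x};v]
/ the result matches applying the function to the whole vector
r~neg v
```

Because each thread receives one large chunk rather than many small items, the overhead of handing data to threads is paid far fewer times.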

In fact, since `neg` has native multithreading and operates on vectors and on vectors of vectors, it is best left on its own:
```
\ts neg each data
5 67109216
\ts neg data
5 67109104
neg data
```
This example is of course extreme, but it shows that each use case deserves some thought about where to iterate and where to place `peach`.
erichards

Member
November 17, 2023 at 12:00 am

I guess a more succinct version of my question is “what happens to native parallelisations when running queries inside an instance of peach?”
erichards

Member
November 20, 2023 at 12:00 am

Many thanks for the reply and examples.

“in fact, since `neg` has native multithreading and operates on vectors and on vectors of vectors, it is best left on its own”

This is what I was keen to understand, and it’s useful to know that there are cases when you may be better off without peach.
rocuinneagain

Member
February 23, 2024 at 12:00 am

kdb+ 4.1 has been released with some interesting improvements to peach. They change some of my answers above, as nesting is now supported:

https://code.kx.com/q//releases/ChangesIn4.1/#peachparallel-processing-enhancements
