The invocations of element access functions in parallel algorithms invoked with an execution policy object of type execution::parallel_policy are permitted to execute either in the invoking thread of execution or in a thread of execution implicitly created by the library to support parallel algorithm execution. If the threads of execution created by thread ([thread.thread.class]) or jthread ([thread.jthread.class]) provide concurrent forward progress guarantees ([intro.progress]), then a thread of execution implicitly created by the library will provide parallel forward progress guarantees; otherwise, the provided forward progress guarantee is implementation-defined. Any such invocations executing in the same thread of execution are indeterminately sequenced with respect to each other.
[Note 6: It is the caller's responsibility to ensure that the invocation does not introduce data races or deadlocks. — end note]
[Example 1:
int a[] = {0,1};
std::vector<int> v;
std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int i) {
  v.push_back(i*2+1);                   // incorrect: data race
});
The program above has a data race because of the unsynchronized access to the container v. — end example]
[Example 2:
std::atomic<int> x{0};
int a[] = {1,2};
std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
  x.fetch_add(1, std::memory_order::relaxed);
  // spin-wait for another iteration to change the value of x
  while (x.load(std::memory_order::relaxed) == 1) { }           // incorrect: assumes execution order
});
The above example depends on the order of execution of the iterations, and will not terminate if both iterations are executed sequentially on the same thread of execution. — end example]
[Example 3:
int x = 0;
std::mutex m;
int a[] = {1,2};
std::for_each(std::execution::par, std::begin(a), std::end(a), [&](int) {
  std::lock_guard<std::mutex> guard(m);
  ++x;
});
The above example synchronizes access to the object x, ensuring that it is incremented correctly. — end example]