Return an RDD created by piping elements to a forked external process. The resulting RDD is computed by executing the given process once per partition. All elements of each input partition are written to a process's stdin as lines of input separated by a newline. The resulting partition consists of the process's stdout output, with each line of stdout resulting in one element of the output partition. A process is invoked even for empty partitions.
The way data is written to the process can be customized by providing two functions, printPipeContext and printRDDElement.
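The once-per-partition behavior described above can be sketched with a minimal local job. This is an illustrative sketch, not part of the original documentation: it assumes a Unix-like machine with `tr` on the PATH and Spark on the classpath, and the names `PipeBasics` and `upperCaseViaPipe` are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PipeBasics {
  // Hypothetical helper: each partition's elements go to `tr` on stdin,
  // one per line; each line `tr` prints becomes one output element.
  def upperCaseViaPipe(words: RDD[String]): RDD[String] =
    words.pipe(Seq("tr", "a-z", "A-Z"))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("pipe-basics").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Two partitions, so the external command is forked twice.
      val words = sc.parallelize(Seq("a", "b", "c", "d"), numSlices = 2)
      upperCaseViaPipe(words).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Because the command is given as a `Seq[String]`, each argument reaches the process as-is, with no shell-style tokenization by Spark.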
Params:
command – command to run in forked process.
env – environment variables to set.
printPipeContext – Called before any elements are piped, as an opportunity to pipe context data. A print-line function (like out.println) is passed as printPipeContext's parameter.
printRDDElement – Use this function to customize how elements are piped. It is called with each RDD element as the 1st parameter and the print-line function (like out.println()) as the 2nd parameter. An example that pipes the RDD data of groupBy() in a streaming way, instead of constructing a huge String that concatenates all the elements:
def printRDDElement(record:(String, Seq[String]), f:String=>Unit) =
for (e <- record._2) {f(e)}
separateWorkingDir – Use separate working directories for each task.
bufferSize – Buffer size for the stdin writer for the piped process.
encod
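The parameters above can be combined as in the following sketch. It is an illustration under stated assumptions rather than the documented usage: it assumes `cat` is available on the PATH, the names `PipeCustomPrint`, `printContext`, `printElement` and `pipeGrouped`, the header text, and the `LC_ALL` setting are all hypothetical, and the record type uses `Iterable[String]` (what `groupByKey` returns in current Spark) instead of the `Seq[String]` shown in the parameter description.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object PipeCustomPrint {
  // Called once per partition, before any element is written to stdin.
  def printContext(printLine: String => Unit): Unit =
    printLine("# begin partition")

  // Streams every value of a grouped record as its own stdin line,
  // so no large concatenated String is ever built.
  def printElement(record: (String, Iterable[String]), printLine: String => Unit): Unit =
    for (e <- record._2) printLine(e)

  // Hypothetical helper wiring the custom printers into pipe().
  def pipeGrouped(grouped: RDD[(String, Iterable[String])]): RDD[String] =
    grouped.pipe(
      command = Seq("cat"),
      env = Map("LC_ALL" -> "C"), // illustrative environment variable
      printPipeContext = printContext,
      printRDDElement = printElement,
      separateWorkingDir = false,
      bufferSize = 8192)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("pipe-custom").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val pairs = sc.parallelize(Seq("k1" -> "v1", "k1" -> "v2", "k2" -> "v3"), 2)
      pipeGrouped(pairs.groupByKey()).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Since `cat` echoes its input, the output contains one header line per partition followed by that partition's values, which makes the effect of both callbacks visible.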