Spark Window Functions for DataFrames and SQL
VIA: http://xinhstechblog.blogspot.com/2016/04/spark-window-functions-for-dataframes.html

Introduced in Spark 1.4, Spark window functions improved the expressiveness of Spark DataFrames and Spark SQL. With window functions, you can easily calculate a moving average or cumulative sum, or reference a value in a previous row of a table. Window functions allow you to do many common calculations with DataFrames, without having to resort to RDD manipulation.

Aggregates, UDFs vs. Window functions

Window functions are complementary to existing DataFrame operations: aggregates, such as sum and avg, and UDFs. To review, aggregates calculate one result, a sum or average, for each group of rows, whereas UDFs calculate one result for each row based on only data in that row. In contrast, window functions calculate one result for each row based on a window of rows. For example, in a moving average, you calculate for eac...
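
To make the distinction between aggregates, UDFs, and window functions concrete, here is a minimal sketch in plain Python. It does not use the Spark API; it only imitates the three kinds of calculation on a small in-memory list. The sample data and the function names (aggregate_sum, per_row, windowed) are hypothetical and chosen for illustration only, and the three-row frame (previous, current, and next row) is just one assumed window shape.

```python
from collections import defaultdict

# Hypothetical sample data: (customer, date, amount_spent)
rows = [
    ("Alice", "2016-05-01", 50.0),
    ("Alice", "2016-05-03", 45.0),
    ("Alice", "2016-05-04", 55.0),
    ("Bob", "2016-05-01", 25.0),
    ("Bob", "2016-05-04", 29.0),
    ("Bob", "2016-05-06", 27.0),
]


def aggregate_sum(rows):
    """Aggregate style: one result per group of rows."""
    totals = defaultdict(float)
    for name, _, amount in rows:
        totals[name] += amount
    return dict(totals)


def per_row(rows, fn):
    """UDF style: one result per row, using only that row's data."""
    return [fn(amount) for _, _, amount in rows]


def windowed(rows):
    """Window style: one result per row, computed from a window of rows.

    Rows are partitioned by customer and ordered by date. For each row
    this returns the moving average over the previous, current, and next
    row, the cumulative sum so far, and the previous row's amount.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)

    out = []
    for name, part in partitions.items():
        part = sorted(part, key=lambda r: r[1])  # order by date
        amounts = [r[2] for r in part]
        running = 0.0
        for i, (_, date, amount) in enumerate(part):
            frame = amounts[max(0, i - 1): i + 2]  # previous, current, next
            running += amount
            previous = amounts[i - 1] if i > 0 else None
            out.append(
                (name, date, amount, sum(frame) / len(frame), running, previous)
            )
    return out


if __name__ == "__main__":
    print(aggregate_sum(rows))
    print(per_row(rows, lambda amount: amount * 2))
    for result in windowed(rows):
        print(result)
```

Note that the aggregate collapses six rows into two results, while both the per-row function and the windowed calculation return one result per input row. Only the windowed version reads values from neighbouring rows.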