Posts

Showing posts from 2016

Spark Window Functions for DataFrames and SQL

Via: http://xinhstechblog.blogspot.com/2016/04/spark-window-functions-for-dataframes.html

Introduced in Spark 1.4, Spark window functions improved the expressiveness of Spark DataFrames and Spark SQL. With window functions, you can easily calculate a moving average or cumulative sum, or reference a value in a previous row of a table. Window functions allow you to do many common calculations with DataFrames, without having to resort to RDD manipulation.

Aggregates, UDFs vs. Window functions

Window functions are complementary to existing DataFrame operations: aggregates, such as sum and avg, and UDFs. To review, aggregates calculate one result, a sum or average, for each group of rows, whereas UDFs calculate one result for each row based only on data in that row. In contrast, window functions calculate one result for each row based on a window of rows. For example, in a moving average, you calculate for eac...
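The distinction between aggregates, UDFs, and window functions described in this excerpt can be sketched in plain Python. This is not the Spark API and is not taken from the original post; the sample rows and the function names (aggregate_sum, per_row_double, moving_average) are hypothetical, chosen only to show how many results each kind of operation produces and which rows it looks at.

```python
from collections import defaultdict

# Hypothetical sample data: (group, day, value), already ordered by day within each group.
rows = [
    ("a", 1, 10.0),
    ("a", 2, 20.0),
    ("a", 3, 60.0),
    ("b", 1, 5.0),
    ("b", 2, 15.0),
]


def aggregate_sum(rows):
    """Aggregate-style: one result per group of rows."""
    totals = defaultdict(float)
    for group, _, value in rows:
        totals[group] += value
    return dict(totals)


def per_row_double(rows):
    """UDF-style: one result per row, using only that row's data."""
    return [value * 2 for _, _, value in rows]


def moving_average(rows, before=1, after=1):
    """Window-style: one result per row, computed from nearby rows in the same group."""
    by_group = defaultdict(list)
    for group, _, value in rows:
        by_group[group].append(value)

    results = []
    position = defaultdict(int)  # index of the current row within its group
    for group, _, _ in rows:
        values = by_group[group]
        i = position[group]
        # The window covers `before` rows behind and `after` rows ahead, clipped at the group edges.
        window = values[max(0, i - before): i + after + 1]
        results.append(sum(window) / len(window))
        position[group] += 1
    return results


print(aggregate_sum(rows))
print(per_row_double(rows))
print(moving_average(rows))
```

The aggregate returns one value per group, while the other two return one value per input row; only the window version reads neighbouring rows to compute each result.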

[Python] Using % and .format() for great good!

Via: https://pyformat.info/

Python has had awesome string formatters for many years, but the documentation on them is far too theoretical and technical. With this site we try to show you the most common use-cases covered by the old and new style string formatting API with practical examples. If not otherwise stated, all examples work with Python 2.7, 3.2, 3.3, and 3.4 without requiring any additional libraries or monkey-patching.

Further details about these two formatting methods can be found in the official Python documentation: old style, new style. If you want to contribute more examples, feel free to create a pull-request on Github!

Table of Contents: Basic formatting, Value conversion, Padding and aligning strings, Truncating long strings, Combining truncating and padding, Numbers, Padding numbers, Signed numbers, Named placeholders, Getite...
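A few of the topics listed in the table of contents can be sketched side by side as follows. This is a minimal illustration, not a copy of the site's examples; the variable names and sample values are chosen here for demonstration. Each pair shows the old style (% operator) next to the new style (str.format) producing the same string.

```python
# Each tuple: (topic, old-style result, new-style result).
examples = [
    # Basic formatting: positional substitution.
    ("basic", "%s %s" % ("one", "two"), "{} {}".format("one", "two")),
    # Padding: right-align in a field 10 characters wide.
    ("padding", "%10s" % ("test",), "{:>10}".format("test")),
    # Truncating: keep only the first 5 characters.
    ("truncating", "%.5s" % ("xylophone",), "{:.5}".format("xylophone")),
    # Padding numbers: zero-pad an integer to width 4.
    ("padded number", "%04d" % (42,), "{:04d}".format(42)),
    # Signed numbers: always show the sign.
    ("signed number", "%+d" % (42,), "{:+d}".format(42)),
    # Named placeholders: look values up by name.
    (
        "named",
        "%(first)s %(last)s" % {"first": "Ada", "last": "Lovelace"},
        "{first} {last}".format(first="Ada", last="Lovelace"),
    ),
]

for topic, old_style, new_style in examples:
    # repr() makes leading padding visible in the output.
    print("%-14s old=%r new=%r" % (topic, old_style, new_style))
```

In every pair above the two styles yield identical output, so the choice between them in these cases is mostly a matter of readability and codebase convention.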