Using the count aggregate function with groupBy in Spark

This article shows how to use the count aggregate function together with groupBy in Spark, with examples in Python, Scala, and Java.

Solution 1

count() can be used inside agg() alongside the other aggregate functions, because they all operate on the same groupBy grouping.

With Python

import pyspark.sql.functions as func

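# The outer parentheses let the chained calls span several lines
(
    new_log_df.cache()
    .withColumn("timePeriod", encodeUDF(new_log_df["START_TIME"]))
    .groupBy("timePeriod")
    .agg(
        func.mean("DOWNSTREAM_SIZE").alias("Mean"),
        func.stddev("DOWNSTREAM_SIZE").alias("Stddev"),
        func.count(func.lit(1)).alias("Num Of Records"),  # rows per group
    )
    .show(20, False)  # print up to 20 rows without truncating values
)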

See the PySpark SQL functions documentation.

With Scala

import org.apache.spark.sql.functions._ //for count()

new_log_df.cache().withColumn("timePeriod", encodeUDF(col("START_TIME"))) 
  .groupBy("timePeriod")
  .agg(
     mean("DOWNSTREAM_SIZE").alias("Mean"), 
     stddev("DOWNSTREAM_SIZE").alias("Stddev"),
     count(lit(1)).alias("Num Of Records")
   )
  .show(20, false)

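count(lit(1)) counts every row in each group. Within a group this gives the same result as count("timePeriod") as long as timePeriod is not null, because count on a column skips null values.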

With Java

import static org.apache.spark.sql.functions.*;

new_log_df.cache().withColumn("timePeriod", encodeUDF(col("START_TIME"))) 
  .groupBy("timePeriod")
  .agg(
     mean("DOWNSTREAM_SIZE").alias("Mean"), 
     stddev("DOWNSTREAM_SIZE").alias("Stddev"),
     count(lit(1)).alias("Num Of Records")
   )
  .show(20, false);
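
The difference between counting rows and counting a column's values can be illustrated without a Spark cluster. The following is a minimal plain-Python sketch, not Spark code: the sample rows and the summarize helper are hypothetical, and None stands in for SQL NULL. It mimics the grouped aggregation above and adds a second count that skips null keys, the way count on a column would.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical sample rows: (timePeriod, DOWNSTREAM_SIZE).
# None stands in for SQL NULL.
rows = [
    ("morning", 100),
    ("morning", 300),
    ("evening", 50),
    (None, 70),
    (None, 90),
]


def summarize(rows):
    """Group rows by timePeriod and compute per-group statistics."""
    groups = defaultdict(list)
    for period, size in rows:
        groups[period].append(size)

    summary = {}
    for period, sizes in groups.items():
        summary[period] = {
            "Mean": mean(sizes),
            # Sample standard deviation is undefined for a single value.
            "Stddev": stdev(sizes) if len(sizes) > 1 else None,
            # Like count(lit(1)): every row in the group is counted.
            "Num Of Records": len(sizes),
            # Like count("timePeriod"): rows with a null key are skipped.
            "Non-null keys": 0 if period is None else len(sizes),
        }
    return summary


result = summarize(rows)
for period, stats in result.items():
    print(period, stats)
```

In this sketch the two counts agree for the "morning" and "evening" groups and differ only for the group whose key is None, which is the one case where the choice of count expression matters.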

Original author of this content: mrsrinivas

Siddharth

I am an Information Technology engineer with an MCA and more than four years of experience. I am a web developer with knowledge of multiple back-end platforms, such as PHP, Node.js, and Python, and front-end JavaScript frameworks such as Angular, React, and Vue.
