Ways to Get a Column from a Spark DataFrame -- col, $, column, apply
Original article accessed: July 9, 2023


Original author: 大葱拌豆腐

Original post: Spark获取DataFrame中列的几种姿势 -- col, $, column, apply



1. Official documentation

   df("columnName")            // On a specific DataFrame.
   col("columnName")           // A generic column not yet associated with a DataFrame.
   col("columnName.field")     // Extracting a struct field.
   col("`a.column.with.dots`") // Escape `.` in column names.
   $"columnName"               // Scala shorthand for a named column.
   expr("a + 1")               // A column constructed from a parsed SQL expression.
   lit("abc")                  // A column that produces a literal (constant) value.
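The expressions above can all be mixed inside a single query. The following is a minimal sketch (not from the original post) that assumes a SparkSession named `spark` is already in scope, as it is in spark-shell:

   import org.apache.spark.sql.functions.{col, expr, lit}
   import spark.implicits._

   val df = spark.range(3).toDF("id")

   df.select(
     df("id"),          // bound to this specific DataFrame
     col("id") + 1,     // generic column, resolved against df at analysis time
     $"id" * 2,         // implicit ColumnName shorthand from spark.implicits._
     expr("id + 1"),    // column built from a parsed SQL expression
     lit("abc")         // constant (literal) column
   ).show()

All five expressions produce org.apache.spark.sql.Column values, so they compose freely with arithmetic operators and with each other.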

2. Required imports

   import spark.implicits._
   import org.apache.spark.sql.functions._
   import org.apache.spark.sql.Column

3. Demo

scala> val idCol = $"id"
idCol: org.apache.spark.sql.ColumnName = id

scala> val idCol = col("id")
idCol: org.apache.spark.sql.Column = id

scala> val idCol = column("id")
idCol: org.apache.spark.sql.Column = id


scala> val dataset = spark.range(5).toDF("text")
dataset: org.apache.spark.sql.DataFrame = [text: bigint]

scala> val textCol = dataset.col("text")
textCol: org.apache.spark.sql.Column = text

scala> val textCol = dataset.apply("text")
textCol: org.apache.spark.sql.Column = text

scala> val textCol = dataset("text")
textCol: org.apache.spark.sql.Column = text
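dataset("text"), dataset.col("text"), and dataset.apply("text") all resolve to the same Column (dataset("text") simply delegates to apply, which delegates to col), so the three forms are interchangeable. As a small illustrative sketch continuing the session above, the resulting handle can be used in any transformation that takes a Column:

   scala> dataset.filter(textCol > 2).select(textCol).show()
   +----+
   |text|
   +----+
   |   3|
   |   4|
   +----+

Note that textCol stays bound to this specific DataFrame; to write the same predicate without a DataFrame reference, the generic forms col("text") or $"text" work equally well here.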