
【Spark】Converting a HashMap to an RDD

Time: 2025/7/11 14:56:48  Source: https://blog.csdn.net/d905133872/article/details/140656517

1. Read the local file and parse it into a map

import java.util
import scala.collection.JavaConverters._
import scala.io.Source
import com.alibaba.fastjson.JSON

// Extract the "value" field from a "[{...}]"-wrapped entry.
def getValue(str: String): String = {
  val value = str.replace("[", "").replace("]", "")
  JSON.parseObject(value).get("value").toString
}

val path = "file path"
// Read the whole file as one JSON string, stripping spaces.
val source = Source.fromFile(path).getLines().toList.mkString("").replaceAll(" ", "")
val key = JSON.parseObject(source).get("key").toString
// Per the sample data format below, the column entries live under "columns".
val columns = JSON.parseObject(source).get("columns").toString

val map = new util.HashMap[String, String]()
map.put("RK", getValue(key))
JSON.parseObject(columns).keySet().asScala.foreach { elem =>
  val valueJson = JSON.parseObject(columns).get(elem).toString
  map.put(elem, getValue(valueJson))
}
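The flattening logic above can be checked outside of Spark. Below is a minimal Python sketch (illustration only, not the author's code) that mirrors the same steps against data in the format described in the note below: take the "value" field out of each one-element wrapper list and collect everything into a flat row keyed by column name.

```python
import json

# Sample record in the format described in the note below (assumed shape:
# each entry is a one-element list of {"name", "type", "value"} objects).
SAMPLE = """
{"key": [{"name": "RK", "type": "String", "value": "1234567890"}],
 "columns": {
   "column_name1": [{"name": "column_name1", "type": "String", "value": "111"}],
   "column_name2": [{"name": "column_name2", "type": "String", "value": "222"}]
 }}
"""

def get_value(entry):
    """Extract the "value" field from a one-element [{...}] wrapper."""
    return entry[0]["value"]

def flatten(source: str) -> dict:
    """Flatten a record into {column_name: value}, with the key under "RK"."""
    record = json.loads(source)
    row = {"RK": get_value(record["key"])}
    for name, entry in record["columns"].items():
        row[name] = get_value(entry)
    return row

print(flatten(SAMPLE))
# {'RK': '1234567890', 'column_name1': '111', 'column_name2': '222'}
```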

2. Convert the map into an RDD / DataFrame

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Build the schema from the map's keys; java.util.HashMap iterates
// keySet() and values() in a consistent order, so fields and values line up.
val schema = StructType(map.asScala.toSeq.map { case (k, v) =>
  StructField(k, StringType, nullable = true)
})
val row = Row.fromSeq(map.values().asScala.toSeq)
val rowRDD = spark.sparkContext.parallelize(Seq(row))
val df = spark.createDataFrame(rowRDD, schema)
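One caveat with this approach: the schema's field order and the row's value order must agree, which here relies on two separate iterations of the same HashMap lining up. A Spark-free Python sketch (hypothetical names, for illustration) shows a safer pattern: fix the key order once and derive both the field list and the row from it.

```python
# Hypothetical flattened record (stands in for the HashMap above).
record = {"RK": "1234567890", "column_name1": "111", "column_name2": "222"}

# Derive the field order once, then reuse it for both schema and row,
# instead of trusting two independent iterations to agree.
fields = sorted(record.keys())
row = tuple(record[k] for k in fields)

print(fields)  # ['RK', 'column_name1', 'column_name2']
print(row)     # ('1234567890', '111', '222')
```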

Note: data format

{
  "key": [{"name": "RK", "type": "String", "value": "1234567890"}],
  "columns": {
    "column_name1": [{"name": "column_name1", "type": "String", "value": "111"}],
    "column_name2": [{"name": "column_name2", "type": "String", "value": "222"}],
    "column_name3": [{"name": "column_name3", "type": "String", "value": "333"}]
  }
}

