Could anyone please let me know how to read Parquet files in C#? I have tried using Parquet.Net. It works fine when generating Parquet files, but I get the error below when reading one. The file was generated using the same code shown at https://github.com/elastacloud/parquet-dotnet. I have validated the Parquet file, and it is valid.
"message": "not a Parquet file(head is '')"
Below is the code that I have used to read the file:
using System;
using System.Collections.Generic;
using Parquet;
using Parquet.Data;
using System.Linq;
using System.Text;
using System.IO;

namespace ReadParquet
{
    class Program
    {
        static void Main(string[] args)
        {
            // open file stream
            using (Stream fileStream = System.IO.File.OpenRead("C:\\Users\\snelaturu\\work\\test.parquet"))
            {
                // open parquet file reader
                using (var parquetReader = new ParquetReader(fileStream))
                {
                    // get file schema (available straight after opening parquet reader)
                    // however, get only data fields as only they contain data values
                    DataField[] dataFields = parquetReader.Schema.GetDataFields();

                    // enumerate through row groups in this file
                    for (int i = 0; i < parquetReader.RowGroupCount; i++)
                    {
                        // create row group reader
                        using (ParquetRowGroupReader groupReader = parquetReader.OpenRowGroupReader(i))
                        {
                            // read all columns inside each row group (you have an option to read
                            // only required columns if you need to)
                            DataColumn[] columns = dataFields.Select(groupReader.ReadColumn).ToArray();

                            // get first column, for instance
                            DataColumn firstColumn = columns[0];

                            // .Data member contains a typed array of column data
                            // you can cast to the type of the column
                            Array data = firstColumn.Data;
                            int[] ids = (int[])data;
                        }
                    }
                }
            }
        }
    }
}

https://stackoverflow.com/questions/66057229/how-to-read-parquet-files-in-c-sharp
February 05, 2021 at 11:37AM
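One way to narrow down the error quoted above is to look at the raw bytes of the file before handing it to the reader. The Parquet format puts the 4-byte ASCII magic PAR1 at both the start and the end of a file, so the sketch below prints the file length and both 4-byte markers. It assumes the error means the reader did not find that magic at the start of the stream; the class and method names (ParquetMagicCheck, HasParquetMagic, ReadFully) are hypothetical and are not part of Parquet.Net.

```csharp
using System;
using System.IO;
using System.Text;

static class ParquetMagicCheck
{
    // Reads exactly buffer.Length bytes, or throws if the stream ends first.
    static void ReadFully(Stream stream, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = stream.Read(buffer, offset, buffer.Length - offset);
            if (read == 0) throw new EndOfStreamException();
            offset += read;
        }
    }

    // Returns true when the file starts and ends with the ASCII bytes "PAR1".
    public static bool HasParquetMagic(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            Console.WriteLine($"length={fs.Length}");

            // Too short to hold both the leading and the trailing magic.
            if (fs.Length < 8) return false;

            byte[] magic = new byte[4];
            ReadFully(fs, magic);
            string head = Encoding.ASCII.GetString(magic);

            // Jump to the last 4 bytes of the file.
            fs.Seek(-4, SeekOrigin.End);
            ReadFully(fs, magic);
            string tail = Encoding.ASCII.GetString(magic);

            Console.WriteLine($"head='{head}', tail='{tail}'");
            return head == "PAR1" && tail == "PAR1";
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: ParquetMagicCheck <path-to-file>");
            return;
        }
        // Hypothetical usage: pass the path of the file that fails to open.
        Console.WriteLine(HasParquetMagic(args[0]));
    }
}
```

If the printed length is 0 or the head is not PAR1, the problem is likely with the file at that path (for example, an empty or partially written file) rather than with the reading code.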