I am trying to use Akka Streams to process a large file (the source is a file, the sink is a simple println
sink). Here is my code:
    final Path file = Paths.get("f:/someFile");

    // sink, do something interesting later
    Sink<ByteString, CompletionStage<Done>> printlnSink =
        Sink.<ByteString>foreach(chunk -> {
            String line = chunk.utf8String(); // turn each chunk to a String
            System.out.println(line);
        });

    // flow, which has a chunk size
    ByteString separator1 = ByteString.fromString(".");   // this will work
    ByteString separator2 = ByteString.fromString(".\n"); // this will NOT work
    final Flow<ByteString, ByteString, NotUsed> flow =
        Framing.delimiter(separator1, 1000, FramingTruncation.ALLOW);

    // put them together and let it run
    CompletionStage<IOResult> ioResult = FileIO
        .fromPath(file)
        .via(flow)
        .to(printlnSink)
        .run(system);
If I pass separator1 to Framing.delimiter(...), it works fine, but if I pass separator2 to Framing.delimiter(...), it does not work at all.
Here is part of the file I want to process:
    @prefix ns1: <http://www.w3.org/2004/02/skos/core#> .
    @prefix ns2: <http://hadatac.org/ont/hasco/> .
    @prefix ns3: <http://purl.org/dc/terms/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix sio: <http://semanticscience.org/resource/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    ns2:Cohort a owl:Class ; rdfs:subClassOf <http://purl.org/twc/HHEAR_00050> .
    ns2:wasApprovedBy a owl:AnnotationProperty .
As you can tell, I cannot use a single . as the separator because it appears in the middle of a statement in many places (inside the URIs, for example). That is why I need to use .\n as the separator, which does not work.
Can someone help me figure out what I did wrong?
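
One thing that may be worth checking before changing the stream itself: Framing.delimiter matches the exact byte sequence it is given, and it fails the stream when a frame grows past the maximum frame length (1000 here). Two guesses follow from that, and neither is confirmed by the post. The file may use Windows line endings, so each statement ends with .\r\n and .\n never appears; or some statement may be longer than 1000 bytes once it is split on .\n instead of on a single dot. The plain-Java sketch below checks both on a byte array. The names DelimiterCheck, countOccurrences and longestFrame are hypothetical helpers written for this illustration, they are not part of Akka, and the sample bytes in main are made up.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper (not an Akka API): inspects raw bytes to see
// which delimiter actually occurs and how long the resulting frames would be.
final class DelimiterCheck {

    // True if `needle` occurs in `haystack` starting at index `pos`.
    static boolean matchesAt(byte[] haystack, byte[] needle, int pos) {
        if (pos + needle.length > haystack.length) {
            return false;
        }
        for (int j = 0; j < needle.length; j++) {
            if (haystack[pos + j] != needle[j]) {
                return false;
            }
        }
        return true;
    }

    // Counts non-overlapping occurrences of `needle` in `haystack`.
    static int countOccurrences(byte[] haystack, byte[] needle) {
        if (needle.length == 0) {
            throw new IllegalArgumentException("needle must not be empty");
        }
        int count = 0;
        int i = 0;
        while (i + needle.length <= haystack.length) {
            if (matchesAt(haystack, needle, i)) {
                count++;
                i += needle.length;
            } else {
                i++;
            }
        }
        return count;
    }

    // Length in bytes of the longest frame when `data` is split on `delimiter`
    // (delimiter bytes excluded, trailing bytes after the last delimiter included).
    static int longestFrame(byte[] data, byte[] delimiter) {
        if (delimiter.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        int longest = 0;
        int start = 0;
        int i = 0;
        while (i + delimiter.length <= data.length) {
            if (matchesAt(data, delimiter, i)) {
                longest = Math.max(longest, i - start);
                i += delimiter.length;
                start = i;
            } else {
                i++;
            }
        }
        return Math.max(longest, data.length - start);
    }

    public static void main(String[] args) {
        // Made-up sample with Windows line endings (an assumption, for illustration only).
        byte[] sample = "@prefix a: <http://example.org/a#> .\r\n@prefix b: <http://example.org/b#> .\r\n"
                .getBytes(StandardCharsets.UTF_8);
        byte[] lf = ".\n".getBytes(StandardCharsets.UTF_8);
        byte[] crlf = ".\r\n".getBytes(StandardCharsets.UTF_8);

        System.out.println("\".\\n\" occurrences:   " + countOccurrences(sample, lf));
        System.out.println("\".\\r\\n\" occurrences: " + countOccurrences(sample, crlf));
        System.out.println("longest frame on \".\\n\":   " + longestFrame(sample, lf));
        System.out.println("longest frame on \".\\r\\n\": " + longestFrame(sample, crlf));
    }
}
```

To try this on the real input, the first few kilobytes of f:/someFile could be read into a byte array and passed in place of the sample. If .\n never occurs but .\r\n does, the separator would need to match the file's line endings; if the longest frame exceeds 1000, the maximum frame length passed to Framing.delimiter would need to be larger.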
https://stackoverflow.com/questions/66020663/strange-framing-delimiter-behavior February 03, 2021 at 11:05AM