Tuesday, February 2, 2021

strange Framing.delimiter behavior

I am trying to use Akka Streams to process a large file (the source is a file and the sink is a simple println sink). Here is my code:

    final Path file = Paths.get("f:/someFile");

    // sink, do something interesting later
    Sink<ByteString, CompletionStage<Done>> printlnSink = Sink.<ByteString>foreach(chunk -> {
        String line = chunk.utf8String();  // turn each chunk to a String
        System.out.println(line);
    });

    // flow, which has a chunk size
    ByteString separator1 = ByteString.fromString(".");    // this will work
    ByteString separator2 = ByteString.fromString(".\n");  // this will NOT work
    final Flow<ByteString, ByteString, NotUsed> flow = Framing.delimiter(separator1, 1000, FramingTruncation.ALLOW);

    // put them together and let it run
    CompletionStage<IOResult> ioResult = FileIO
            .fromPath(file)
            .via(flow)
            .to(printlnSink)
            .run(system);

If I pass separator1 to Framing.delimiter(...), it works fine, but if I pass separator2 to Framing.delimiter(...), it does not work at all.

Here is part of the file I want to process:

    @prefix ns1: <http://www.w3.org/2004/02/skos/core#> .
    @prefix ns2: <http://hadatac.org/ont/hasco/> .
    @prefix ns3: <http://purl.org/dc/terms/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix sio: <http://semanticscience.org/resource/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    ns2:Cohort a owl:Class ;
        rdfs:subClassOf <http://purl.org/twc/HHEAR_00050> .

    ns2:wasApprovedBy a owl:AnnotationProperty .

As you can tell, I cannot use a single . as the separator, because it also appears in the middle of statements in many places (for example, inside the IRIs). That is why I need to use .\n as the separator, and that is the one that does not work.

Can someone help me figure out what I did wrong?
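
One thing that may be worth checking is the file's line endings. This is only a guess, prompted by the f:/ path, which hints at Windows: if the lines end in \r\n, the two-byte sequence .\n never occurs in the file, so a delimiter of .\n would never match, whereas a lone . still would. If I read the documentation correctly, Framing.delimiter then fails the stream once a frame grows past the maximum frame length (1000 here). The sketch below is plain Java and does not use Akka or reproduce its implementation; `DelimiterCheck`, `countOccurrences` and `utf8` are hypothetical names used only for illustration, and the sample text is two lines taken from the file excerpt above.

```java
import java.nio.charset.StandardCharsets;

final class DelimiterCheck {

    // Counts non-overlapping occurrences of `delimiter` in `data` with a naive scan.
    // This only illustrates byte-for-byte delimiter matching; it is not Akka's code.
    static int countOccurrences(byte[] data, byte[] delimiter) {
        if (delimiter.length == 0) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        int count = 0;
        int i = 0;
        while (i + delimiter.length <= data.length) {
            boolean match = true;
            for (int j = 0; j < delimiter.length; j++) {
                if (data[i + j] != delimiter[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
                i += delimiter.length;  // skip past the matched delimiter
            } else {
                i++;
            }
        }
        return count;
    }

    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two lines from the sample, once with LF and once with (assumed) CRLF endings.
        String lf = "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
                  + "@prefix sio: <http://semanticscience.org/resource/> .\n";
        String crlf = lf.replace("\n", "\r\n");

        System.out.println(countOccurrences(utf8(lf), utf8(".")));        // 5
        System.out.println(countOccurrences(utf8(lf), utf8(".\n")));      // 2
        System.out.println(countOccurrences(utf8(crlf), utf8(".\n")));    // 0
        System.out.println(countOccurrences(utf8(crlf), utf8(".\r\n")));  // 2
    }
}
```

If the real file does turn out to use \r\n, passing ByteString.fromString(".\r\n") as the separator would be the corresponding thing to try. If it uses plain \n, the cause lies elsewhere and this check at least rules line endings out.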

https://stackoverflow.com/questions/66020663/strange-framing-delimiter-behavior February 03, 2021 at 11:05AM
