I'm working with a hashtable which I've built using a list of 3.5 million IP addresses stored in CSV format, and I am trying to search through this table using wildcards.
The CSV is MaxMind's list of IPs, which I convert to a hashtable using the following code:
    $mainIPHhash = @{}    # the lookup table, keyed on the CIDR
    [System.IO.File]::ReadLines("C:\temp\iptest.csv") | ForEach-Object {
        $data = $_.split(',')
        # nested hashtable holding the city/country geoname ids for this CIDR
        $ht = @{
            "geoname_id"                    = "$($data[1])"
            "registered_country_geoname_id" = "$($data[2])"
        }
        $name = $data[0]
        $mainIPHhash.add($name, $ht)
    }
The code just pulls out the CIDR and its corresponding City/Country code. This works well and builds the table in a little over two minutes, but the issue I am now facing is searching this hashtable for wildcard entries.
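Each key ends up being the CIDR string and each value the nested hashtable, so reading a single entry back looks something like this (the CIDR shown is just an example key):

    # Read back the nested values for one CIDR (illustrative key)
    $entry = $mainIPHhash["1.0.0.0/24"]
    $entry.geoname_id
    $entry.registered_country_geoname_id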
If I search for a complete CIDR, the search happens in milliseconds:

    $mainIPHhash.item("1.0.0.0/24")

Measure-Command reports: TotalSeconds : 0.0001542
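All of the timings quoted here come from Measure-Command, wrapped along these lines (the exact wrapping is approximate):

    # Time the exact-key lookup (wrapping shown for illustration)
    Measure-Command { $mainIPHhash.item("1.0.0.0/24") } | Select-Object TotalSeconds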
But if I need to do a wildcard search, it has to loop through the whole hashtable looking for matching keys, which takes a long time!
    $testingIP = "1.0.*"
    $mainIPHhash.GetEnumerator() | Where-Object { $_.key -like $testingIP }

Measure-Command reports: TotalSeconds : 33.3016279
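Writing the same scan against just the keys doesn't change the underlying problem; something like the following still has to enumerate every entry:

    # Same linear scan, just over the keys collection (still touches all 3.5M entries)
    $mainIPHhash.Keys | Where-Object { $_ -like $testingIP }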
Is there a better way to search for wildcard entries in hashtables?
Cheers
Edit:
Using a regex search, I can get it down to about 19 seconds, but that's still woefully slow.
    $findsStr = "^$(($testingIP2).split('.')[0])" + "\." + "$(($testingIP2).split('.')[1])" + "\."
    $mainIPHhash.GetEnumerator() | foreach {
        if ($_.Key -match $findsStr) {
            # Do stuff
        }
    }
The above takes the first two octets of the IP address and uses a regex to find matching keys in the hashtable.
    Days              : 0
    Hours             : 0
    Minutes           : 0
    Seconds           : 19
    Milliseconds      : 733
    Ticks             : 197339339
    TotalDays         : 0.000228402012731481
    TotalHours        : 0.00548164830555556
    TotalMinutes      : 0.328898898333333
    TotalSeconds      : 19.7339339
    TotalMilliseconds : 19733.9339
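For example, if $testingIP2 is "1.0.0.1" (an illustrative value; the earlier wildcard was "1.0.*"), the pattern that gets built is ^1\.0\., i.e. anything inside 1.0.x.x:

    # Illustrative input value; only the first two octets are used
    $testingIP2 = "1.0.0.1"
    $findsStr = "^$(($testingIP2).split('.')[0])" + "\." + "$(($testingIP2).split('.')[1])" + "\."
    $findsStr   # -> ^1\.0\.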
https://stackoverflow.com/questions/65386279/powershell-creating-hashtables-from-large-text-files-and-searching