<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: ML Spark Exception for processing logs with IPv6 Addresses in Expedition Discussions</title>
    <link>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/227472#M383</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks, we are working to fix it. For now, IPv6 analysis is not supported yet, but we will prevent Spark from quitting when an IPv6 address is found.&lt;/P&gt;</description>
    <pubDate>Fri, 17 Aug 2018 14:44:28 GMT</pubDate>
    <dc:creator>alestevez</dc:creator>
    <dc:date>2018-08-17T14:44:28Z</dc:date>
    <item>
      <title>ML Spark Exception for processing logs with IPv6 Addresses</title>
      <link>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/227470#M382</link>
      <description>&lt;P&gt;Getting the following error when trying to process CSV:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;Exception:&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;Caused by: java.lang.NumberFormatException: For input string: "2001:470:ba7e:20::254"&lt;/STRONG&gt;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;Full trace:&amp;nbsp;&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;(/opt/Spark/spark/bin/spark-submit --class com.paloaltonetworks.tbd.LogCollectorCompacter --deploy-mode client --supervise /var/www/html/OS/spark/packages/LogCoCo-1.2.4-SNAPSHOT.jar MLServer='10.10.50.100', master='local[3]', debug='false', taskID='65', user='admin', dbUser='root', dbPass='paloalto', dbServer='10.10.50.100:3306', timeZone='Europe/Helsinki', mode='Expedition', input='007254000047808:8.0.3:/var/backup/fw1_traffic_2018_08_17_last_calendar_day.csv', output='/var/expedition/connections.parquet', tempFolder='/var/expedition'; echo /var/backup/fw1_traffic_2018_08_17_last_calendar_day.csv; )&amp;gt;&amp;gt; "/tmp/error_logCoCo" 2&amp;gt;&amp;gt;/tmp/error_logCoCo &amp;amp;&lt;BR /&gt; ---- CREATING SPARK Session:&lt;BR /&gt; warehouseLocation:/PALogs/spark-warehouse&lt;BR /&gt;SLF4J: Class path contains multiple SLF4J bindings.&lt;BR /&gt;SLF4J: Found binding in [jar:file:/opt/Spark/extraLibraries/slf4j-nop-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]&lt;BR /&gt;SLF4J: Found binding in [jar:file:/opt/Spark/spark-2.1.1-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]&lt;BR /&gt;SLF4J: See &lt;A href="http://www.slf4j.org/codes.html#multiple_bindings" target="_blank"&gt;http://www.slf4j.org/codes.html#multiple_bindings&lt;/A&gt; for an explanation.&lt;BR /&gt;SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]&lt;BR /&gt;+--------------------+---------------+--------+--------------------+&lt;BR /&gt;| rowLine| fwSerial|panosver| csvpath|&lt;BR /&gt;+--------------------+---------------+--------+--------------------+&lt;BR /&gt;|007254000047808:8...|007254000047808| 8.0.3|/var/backup/fw1_t...|&lt;BR /&gt;+--------------------+---------------+--------+--------------------+&lt;/P&gt;
&lt;P&gt;8.0.0:/var/backup/fw1_traffic_2018_08_17_last_calendar_day.csv&lt;BR /&gt;LogCollector&amp;amp;Compacter called with the following parameters:&lt;BR /&gt; Parameters for execution&lt;BR /&gt; Master[processes]:............ local[3]&lt;BR /&gt; User:......................... admin&lt;BR /&gt; debug:........................ false&lt;BR /&gt; Parameters for Job Connections&lt;BR /&gt; Task ID:...................... 65&lt;BR /&gt; My IP:........................ 10.10.50.100&lt;BR /&gt; Expedition IP:................ 10.10.50.100:3306&lt;BR /&gt; Time Zone:.................... Europe/Helsinki&lt;BR /&gt; dbUser (dbPassword):.......... root (************)&lt;BR /&gt; projectName:.................. demo&lt;BR /&gt; Parameters for Data Sources&lt;BR /&gt; App Categories (source):........ (Expedition)&lt;BR /&gt; CSV Files Path:.................007254000047808:8.0.3:/var/backup/fw1_traffic_2018_08_17_last_calendar_day.csv&lt;BR /&gt; Parquet output path:.......... file:///var/expedition/connections.parquet&lt;BR /&gt; Temporary folder:............. /var/expedition&lt;BR /&gt; ---- AppID DB LOAD:&lt;BR /&gt; Application Categories loading...&lt;BR /&gt; DONE&lt;/P&gt;
&lt;P&gt;Logs of format 7.1.x NOT found&lt;BR /&gt;Logs of format 8.0.2 found&lt;BR /&gt;Logs of format 8.1.0-beta17 NOT found&lt;BR /&gt;Logs of format 8.1.0 NOT found&lt;BR /&gt;Size of trafficExtended: 50 MB&lt;BR /&gt;[Stage 44:&amp;gt; (0 + 3) / 3]Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 44.0 failed 1 times, most recent failure: Lost task 2.0 in stage 44.0 (TID 936, localhost, executor driver): org.apache.spark.SparkException: Failed to execute user defined function(anonfun$18: (string) =&amp;gt; bigint)&lt;BR /&gt; at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)&lt;BR /&gt; at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)&lt;BR /&gt; at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)&lt;BR /&gt; at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)&lt;BR /&gt; at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)&lt;BR /&gt; at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)&lt;BR /&gt; at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)&lt;BR /&gt; at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)&lt;BR /&gt; at org.apache.spark.scheduler.Task.run(Task.scala:99)&lt;BR /&gt; at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)&lt;BR /&gt; at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt; at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt; at java.lang.Thread.run(Thread.java:748)&lt;BR /&gt;Caused by: java.lang.NumberFormatException: For input string: "2001:470:ba7e:20::254"&lt;BR /&gt; at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)&lt;BR /&gt; at java.lang.Integer.parseInt(Integer.java:580)&lt;BR /&gt; at java.lang.Integer.parseInt(Integer.java:615)&lt;BR /&gt; at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)&lt;BR /&gt; at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$.com$paloaltonetworks$tbd$LogCollectorCompacter$$IPv4ToLong$1(LogCollectorCompacter.scala:275)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$$anonfun$18.apply(LogCollectorCompacter.scala:886)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$$anonfun$18.apply(LogCollectorCompacter.scala:886)&lt;BR /&gt; ... 13 more&lt;/P&gt;
&lt;P&gt;Driver stacktrace:&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)&lt;BR /&gt; at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)&lt;BR /&gt; at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)&lt;BR /&gt; at scala.Option.foreach(Option.scala:257)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:802)&lt;BR /&gt; at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1650)&lt;BR /&gt; at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1605)&lt;BR /&gt; at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1594)&lt;BR /&gt; at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)&lt;BR /&gt; at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:628)&lt;BR /&gt; at org.apache.spark.SparkContext.runJob(SparkContext.scala:1925)&lt;BR /&gt; at org.apache.spark.SparkContext.runJob(SparkContext.scala:1938)&lt;BR /&gt; at org.apache.spark.SparkContext.runJob(SparkContext.scala:1951)&lt;BR /&gt; at org.apache.spark.SparkContext.runJob(SparkContext.scala:1965)&lt;BR /&gt; at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:936)&lt;BR /&gt; at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)&lt;BR /&gt; at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)&lt;BR /&gt; at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)&lt;BR /&gt; at org.apache.spark.rdd.RDD.collect(RDD.scala:935)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$.main(LogCollectorCompacter.scala:1039)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter.main(LogCollectorCompacter.scala)&lt;BR /&gt; at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)&lt;BR /&gt; at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)&lt;BR /&gt; at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)&lt;BR /&gt; at java.lang.reflect.Method.invoke(Method.java:498)&lt;BR /&gt; at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)&lt;BR /&gt; at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)&lt;BR /&gt; at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)&lt;BR /&gt; at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)&lt;BR /&gt; at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)&lt;BR /&gt;Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$18: (string) =&amp;gt; bigint)&lt;BR /&gt; at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)&lt;BR /&gt; at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)&lt;BR /&gt; at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)&lt;BR /&gt; at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)&lt;BR /&gt; at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)&lt;BR /&gt; at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)&lt;BR /&gt; at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)&lt;BR /&gt; at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)&lt;BR /&gt; at org.apache.spark.scheduler.Task.run(Task.scala:99)&lt;BR /&gt; at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)&lt;BR /&gt; at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)&lt;BR /&gt; at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)&lt;BR /&gt; at java.lang.Thread.run(Thread.java:748)&lt;BR /&gt;Caused by: java.lang.NumberFormatException: For input string: "2001:470:ba7e:20::254"&lt;BR /&gt; at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)&lt;BR /&gt; at java.lang.Integer.parseInt(Integer.java:580)&lt;BR /&gt; at java.lang.Integer.parseInt(Integer.java:615)&lt;BR /&gt; at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272)&lt;BR /&gt; at scala.collection.immutable.StringOps.toInt(StringOps.scala:29)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$.com$paloaltonetworks$tbd$LogCollectorCompacter$$IPv4ToLong$1(LogCollectorCompacter.scala:275)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$$anonfun$18.apply(LogCollectorCompacter.scala:886)&lt;BR /&gt; at com.paloaltonetworks.tbd.LogCollectorCompacter$$anonfun$18.apply(LogCollectorCompacter.scala:886)&lt;BR /&gt; ... 13 more&lt;BR /&gt;/var/backup/fw1_traffic_2018_08_17_last_calendar_day.csv&lt;/P&gt;</description>
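      <!--
        The root cause visible in the trace above: IPv4ToLong (LogCollectorCompacter.scala:275)
        calls StringOps.toInt on each piece of the address, and the IPv6 string
        "2001:470:ba7e:20::254" contains no "." separators, so the whole string reaches
        Integer.parseInt and throws NumberFormatException. A minimal Scala sketch of that
        failure mode, assuming a naive split-and-parse conversion (the actual Expedition
        source is not public, so the function body here is a reconstruction, not the
        shipped code):

        object Ipv4ToLongRepro {
          // Naive dotted-quad conversion: split on "." and fold the octets into a Long.
          def ipv4ToLong(ip: String): Long =
            ip.split('.').foldLeft(0L)((acc, octet) => acc * 256L + octet.toInt)

          def main(args: Array[String]): Unit = {
            println(ipv4ToLong("10.10.50.100")) // 168440420: four numeric octets

            // An IPv6 address has no "." separators, so split returns the whole
            // string and "2001:470:ba7e:20::254".toInt throws
            // java.lang.NumberFormatException, which fails the UDF and aborts the
            // Spark stage exactly as in the trace above.
            println(ipv4ToLong("2001:470:ba7e:20::254"))
          }
        }
      -->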
      <pubDate>Fri, 17 Aug 2018 14:35:29 GMT</pubDate>
      <guid>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/227470#M382</guid>
      <dc:creator>Jon_Davis</dc:creator>
      <dc:date>2018-08-17T14:35:29Z</dc:date>
    </item>
    <item>
      <title>Re: ML Spark Exception for processing logs with IPv6 Addresses</title>
      <link>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/227472#M383</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks, we are working to fix it. For now, IPv6 analysis is not supported yet, but we will prevent Spark from quitting when an IPv6 address is found.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Aug 2018 14:44:28 GMT</pubDate>
      <guid>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/227472#M383</guid>
      <dc:creator>alestevez</dc:creator>
      <dc:date>2018-08-17T14:44:28Z</dc:date>
    </item>
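    <!--
      The fix promised above ("prevent Spark from quitting when an IPv6 address is
      found") would amount to making the conversion total instead of throwing. A
      hedged sketch, assuming the conversion runs as a Spark SQL UDF (the names and
      the surrounding pipeline are assumptions, not Expedition's actual code): return
      None for anything that is not a dotted-quad IPv4 address, so IPv6 rows become
      null in the resulting bigint column rather than failing the task.

      import org.apache.spark.sql.functions.udf

      object TolerantIpv4Udf {
        private val Ipv4 = """(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})""".r

        // Total conversion: Some(long) for IPv4, None for IPv6 or malformed input.
        def ipv4ToLongOpt(ip: String): Option[Long] = ip match {
          case Ipv4(a, b, c, d) =>
            Some(((a.toLong * 256 + b.toLong) * 256 + c.toLong) * 256 + d.toLong)
          case _ => None // e.g. "2001:470:ba7e:20::254": skipped, not fatal
        }

        // Option[Long] maps to a nullable bigint column, so the job keeps running
        // instead of aborting the whole stage on the first IPv6 row.
        val ipv4ToLongUdf = udf(ipv4ToLongOpt _)
      }
    -->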
    <item>
      <title>Re: ML Spark Exception for processing logs with IPv6 Addresses</title>
      <link>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/229627#M472</link>
      <description>&lt;P&gt;Are we still working on being able to ingest IPv6 addressing in logs, or are you still working on having Spark not quit thus aborting the whole process via stacktrace? &amp;nbsp;It has been 3 weeks and I have not seen an update here. &amp;nbsp;I am running Expedition-Beta versoin 1.0.104.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance,&lt;/P&gt;</description>
      <pubDate>Wed, 05 Sep 2018 22:57:48 GMT</pubDate>
      <guid>https://live.paloaltonetworks.com/t5/expedition-discussions/ml-spark-exception-for-processing-logs-with-ipv6-addresses/m-p/229627#M472</guid>
      <dc:creator>aaron.howell</dc:creator>
      <dc:date>2018-09-05T22:57:48Z</dc:date>
    </item>
  </channel>
</rss>

