The graph above shows per-query performance for .NET for Apache Spark compared with Python and Scala. .NET for Apache Spark performs well against both. In addition, in cases where UDF performance is critical, such as query 1, where 3B rows of non-string data are passed between the JVM and the CLR, .NET for Apache Spark is 2x faster than Python.
It is also worth noting that this is our first release of .NET for Apache Spark, and we aim to invest further in improving and benchmarking performance (e.g. Arrow optimizations). You can follow the instructions in our GitHub repository to run this benchmark yourself.
.NET for Apache Spark is the first step in making .NET an important technology stack for building Big Data applications.
Near-term planned path
Open source at /dotnet/spark