Class PartitionedDataLayer

  • All Implemented Interfaces:
    java.io.Serializable
    Direct Known Subclasses:
    CassandraDataLayer

    public abstract class PartitionedDataLayer
    extends DataLayer
    DataLayer that partitions the token range by the number of Spark partitions and lists only the SSTables overlapping with each partition's range
    See Also:
    Serialized Form
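
    The class description above says the token range is divided by the number of Spark partitions. As a rough illustration of that idea only, the sketch below splits a closed token range into contiguous sub-ranges; the class and method names are hypothetical, and the real splitting logic lives in org.apache.cassandra.spark.data.partitioner.TokenPartitioner and is more sophisticated (it accounts for the ring topology).

    ```java
    import java.math.BigInteger;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch: divide a closed token range [min, max] evenly
    // across a number of Spark partitions. Not the real TokenPartitioner.
    public class TokenRangeSplitter
    {
        // Returns numPartitions contiguous [start, end] sub-ranges covering [min, max].
        public static List<BigInteger[]> split(BigInteger min, BigInteger max, int numPartitions)
        {
            List<BigInteger[]> ranges = new ArrayList<>();
            BigInteger span = max.subtract(min).add(BigInteger.ONE);
            BigInteger step = span.divide(BigInteger.valueOf(numPartitions));
            BigInteger start = min;
            for (int i = 0; i < numPartitions; i++)
            {
                // Last partition absorbs any remainder from integer division.
                BigInteger end = (i == numPartitions - 1) ? max : start.add(step).subtract(BigInteger.ONE);
                ranges.add(new BigInteger[]{ start, end });
                start = end.add(BigInteger.ONE);
            }
            return ranges;
        }

        public static void main(String[] args)
        {
            List<BigInteger[]> ranges = split(BigInteger.ZERO, BigInteger.valueOf(99), 4);
            System.out.println(ranges.size());                                 // 4
            System.out.println(ranges.get(0)[0] + ".." + ranges.get(0)[1]);    // 0..24
        }
    }
    ```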
    • Field Detail

      • consistencyLevel

        @NotNull
        protected org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel
      • datacenter

        protected java.lang.String datacenter
    • Constructor Detail

      • PartitionedDataLayer

        public PartitionedDataLayer​(@Nullable
                                    org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel,
                                    @Nullable
                                    java.lang.String datacenter)
    • Method Detail

      • validateReplicationFactor

        protected void validateReplicationFactor​(@NotNull
                                                 org.apache.cassandra.spark.data.ReplicationFactor replicationFactor)
      • validateReplicationFactor

        public static void validateReplicationFactor​(@NotNull
                                                     org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencyLevel,
                                                     @NotNull
                                                     org.apache.cassandra.spark.data.ReplicationFactor replicationFactor,
                                                     @Nullable
                                                     java.lang.String dc)
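
    The static validateReplicationFactor overload takes both the consistency level and the replication factor, which suggests checking that the two are compatible. The sketch below shows one plausible form of such a check, assuming the standard Cassandra quorum arithmetic (rf/2 + 1); the enum, method names, and exact rule are assumptions, not the real API.

    ```java
    // Hypothetical sketch of a consistency-level vs. replication-factor check,
    // using standard Cassandra quorum arithmetic. Not the real implementation.
    public class ReplicationCheck
    {
        enum ConsistencyLevel { ONE, TWO, QUORUM, LOCAL_QUORUM, ALL }

        // Number of replicas that must respond for the given consistency level.
        static int requiredReplicas(ConsistencyLevel cl, int rf)
        {
            switch (cl)
            {
                case ONE:          return 1;
                case TWO:          return 2;
                case QUORUM:
                case LOCAL_QUORUM: return rf / 2 + 1;
                case ALL:          return rf;
                default:           throw new IllegalArgumentException("Unknown level: " + cl);
            }
        }

        // Fail fast if the replication factor cannot satisfy the consistency level.
        static void validate(ConsistencyLevel cl, int rf)
        {
            if (requiredReplicas(cl, rf) > rf)
            {
                throw new IllegalArgumentException("RF " + rf + " cannot satisfy " + cl);
            }
        }

        public static void main(String[] args)
        {
            validate(ConsistencyLevel.LOCAL_QUORUM, 3);                          // ok: needs 2 of 3
            System.out.println(requiredReplicas(ConsistencyLevel.QUORUM, 3));    // 2
        }
    }
    ```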
      • listInstance

        public abstract java.util.concurrent.CompletableFuture<java.util.stream.Stream<org.apache.cassandra.spark.data.SSTable>> listInstance​(int partitionId,
                                                                                                                                              @NotNull
                                                                                                                                              com.google.common.collect.Range<java.math.BigInteger> range,
                                                                                                                                              @NotNull
                                                                                                                                              org.apache.cassandra.spark.data.partitioner.CassandraInstance instance)
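
    The listInstance signature returns a CompletableFuture of a Stream of SSTables, i.e. listing is asynchronous per instance so multiple replicas can be listed concurrently. A minimal self-contained sketch of that contract is shown below; the SSTable record and the fixed data it returns are stand-ins for the real org.apache.cassandra.spark.data.SSTable and an actual remote listing call.

    ```java
    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.stream.Stream;

    // Sketch of the asynchronous listInstance contract with stand-in types.
    public class ListInstanceSketch
    {
        // Stand-in for org.apache.cassandra.spark.data.SSTable.
        record SSTable(String name) {}

        // A real implementation would list SSTables on the remote instance;
        // here we complete asynchronously with fixed example data.
        static CompletableFuture<Stream<SSTable>> listInstance(int partitionId, String instance)
        {
            return CompletableFuture.supplyAsync(() ->
                Stream.of(new SSTable(instance + "-nb-1-big-Data.db"),
                          new SSTable(instance + "-nb-2-big-Data.db")));
        }

        public static void main(String[] args)
        {
            List<SSTable> tables = listInstance(0, "127.0.0.1").join().toList();
            System.out.println(tables.size()); // 2
        }
    }
    ```

    Returning a future per instance lets a caller fan out over all replicas in a token range and combine the results without blocking on each listing in turn.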
      • ring

        public abstract org.apache.cassandra.spark.data.partitioner.CassandraRing ring()
      • tokenPartitioner

        public abstract org.apache.cassandra.spark.data.partitioner.TokenPartitioner tokenPartitioner()
      • partitioner

        public org.apache.cassandra.spark.data.partitioner.Partitioner partitioner()
        Specified by:
        partitioner in class DataLayer
      • isInPartition

        public boolean isInPartition​(int partitionId,
                                     java.math.BigInteger token,
                                     java.nio.ByteBuffer key)
        Specified by:
        isInPartition in class DataLayer
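
    Given the class description, isInPartition presumably answers whether a row's token falls inside the token range assigned to a given Spark partition. The core of such a check can be sketched with plain BigInteger comparisons; the closed-bound convention here is an assumption for illustration.

    ```java
    import java.math.BigInteger;

    // Hypothetical core of a partition-membership check: a row belongs to a
    // Spark partition iff its token lies inside that partition's token range.
    public class PartitionMembership
    {
        // Closed-range containment check: lower <= token <= upper.
        static boolean isInRange(BigInteger token, BigInteger lower, BigInteger upper)
        {
            return token.compareTo(lower) >= 0 && token.compareTo(upper) <= 0;
        }

        public static void main(String[] args)
        {
            System.out.println(isInRange(BigInteger.valueOf(42),
                                         BigInteger.ZERO,
                                         BigInteger.valueOf(99))); // true
        }
    }
    ```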
      • sparkRangeFilter

        public org.apache.cassandra.spark.sparksql.filters.SparkRangeFilter sparkRangeFilter​(int partitionId)
        Description copied from class: DataLayer
        A DataLayer implementation should provide a SparkRangeFilter to filter out partitions and mutations that do not overlap with the Spark worker's token range.
        Overrides:
        sparkRangeFilter in class DataLayer
        Parameters:
        partitionId - the partitionId for the task
        Returns:
        SparkRangeFilter for the Spark worker's token range
      • partitionKeyFiltersInRange

        public java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> partitionKeyFiltersInRange​(int partitionId,
                                                                                                                         java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> filters)
                                                                                                                  throws org.apache.cassandra.spark.sparksql.NoMatchFoundException
        Overrides:
        partitionKeyFiltersInRange in class DataLayer
        Throws:
        org.apache.cassandra.spark.sparksql.NoMatchFoundException
      • consistencylevel

        public org.apache.cassandra.spark.data.partitioner.ConsistencyLevel consistencylevel()
      • sstables

        public org.apache.cassandra.spark.data.SSTablesSupplier sstables​(int partitionId,
                                                                         @Nullable
                                                                         org.apache.cassandra.spark.sparksql.filters.SparkRangeFilter sparkRangeFilter,
                                                                         @NotNull
                                                                         java.util.List<org.apache.cassandra.spark.sparksql.filters.PartitionKeyFilter> partitionKeyFilters)
        Specified by:
        sstables in class DataLayer
        Parameters:
        partitionId - the partitionId of the task
        sparkRangeFilter - the Spark token range filter
        partitionKeyFilters - the list of partition key filters
        Returns:
        set of SSTables
      • filterNonIntersectingSSTables

        public boolean filterNonIntersectingSSTables()
        Overridable method controlling whether the PartitionedDataLayer should filter out SSTables that do not intersect with the Spark partition's token range
        Returns:
        true if we should filter
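
    To make the filtering above concrete: an SSTable can be skipped when its own token span does not overlap the Spark partition's token range. The sketch below uses the standard closed-interval overlap test; SSTableSummary and its first/last token fields are illustrative stand-ins, since the real code reads this information from SSTable metadata.

    ```java
    import java.math.BigInteger;
    import java.util.List;
    import java.util.stream.Collectors;

    // Hypothetical sketch: keep only SSTables whose [firstToken, lastToken]
    // span overlaps the Spark partition's token range.
    public class SSTableRangeFilter
    {
        // Stand-in for SSTable metadata exposing its token span.
        record SSTableSummary(String name, BigInteger firstToken, BigInteger lastToken) {}

        // Two closed ranges [a, b] and [c, d] overlap iff a <= d and c <= b.
        static boolean intersects(SSTableSummary t, BigInteger lower, BigInteger upper)
        {
            return t.firstToken().compareTo(upper) <= 0 && lower.compareTo(t.lastToken()) <= 0;
        }

        static List<SSTableSummary> filter(List<SSTableSummary> tables, BigInteger lower, BigInteger upper)
        {
            return tables.stream()
                         .filter(t -> intersects(t, lower, upper))
                         .collect(Collectors.toList());
        }

        public static void main(String[] args)
        {
            List<SSTableSummary> tables = List.of(
                new SSTableSummary("a", BigInteger.valueOf(0), BigInteger.valueOf(10)),
                new SSTableSummary("b", BigInteger.valueOf(50), BigInteger.valueOf(60)));
            // Only "a" overlaps the range [5, 20].
            System.out.println(filter(tables, BigInteger.valueOf(5), BigInteger.valueOf(20)).size()); // 1
        }
    }
    ```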
      • getAvailability

        protected PartitionedDataLayer.AvailabilityHint getAvailability​(org.apache.cassandra.spark.data.partitioner.CassandraInstance instance)
        A DataLayer can override this method to hint at the availability of a Cassandra instance, so the Bulk Reader attempts UP instances first and avoids instances known to be down, e.g. if a create-snapshot request has already failed
        Parameters:
        instance - a Cassandra instance
        Returns:
        availability hint
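
    One plausible way such a hint gets used is ordering candidate instances so hinted-UP ones are tried first. The sketch below does exactly that with a stand-in Hint enum and Instance record; the real PartitionedDataLayer.AvailabilityHint enum and its values are not reproduced here.

    ```java
    import java.util.Comparator;
    import java.util.List;
    import java.util.stream.Collectors;

    // Hypothetical sketch: order instances by availability hint so UP
    // instances are attempted before UNKNOWN and DOWN ones.
    public class AvailabilitySort
    {
        // Stand-in for PartitionedDataLayer.AvailabilityHint; ordinal = preference.
        enum Hint { UP, UNKNOWN, DOWN }

        record Instance(String host, Hint hint) {}

        static List<Instance> preferUp(List<Instance> instances)
        {
            return instances.stream()
                            .sorted(Comparator.comparingInt(i -> i.hint().ordinal()))
                            .collect(Collectors.toList());
        }

        public static void main(String[] args)
        {
            List<Instance> sorted = preferUp(List.of(
                new Instance("c", Hint.DOWN),
                new Instance("a", Hint.UP),
                new Instance("b", Hint.UNKNOWN)));
            System.out.println(sorted.get(0).host()); // a
        }
    }
    ```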
      • replicationFactor

        public abstract org.apache.cassandra.spark.data.ReplicationFactor replicationFactor​(java.lang.String keyspace)
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
      • equals

        public boolean equals​(java.lang.Object other)
        Overrides:
        equals in class java.lang.Object