
andreasrimmelspacher_189038 asked ·

Using a user defined type in user defined aggregate

Hello everybody,

I am new to Cassandra and I am trying to extract sketches (e.g. a Bloom filter) from some given data. This is how far I have gotten, and my questions boil down to the following:

CREATE TYPE bloomfilter_udt (
    n_as_sample_size int,
    m_as_number_of_buckets int,
    p_as_next_prime_above_m bigint,
    hash_for_string_coefficient_a list <bigint>,
    hash_for_number_coefficients_a list <bigint>,
    hash_for_number_coefficients_b list <bigint>,
    bloom_filter_as_map map<int, int>
);

CREATE OR REPLACE FUNCTION bloomfilter_udf (
    state bloomfilter_udt,
    value text,
    sample_size int
)
    CALLED ON NULL INPUT
    RETURNS bloomfilter_udt
    LANGUAGE java AS
        $$
        //fill state = bloomfilter_udt with some data
        return state;
        $$
    ;

CREATE OR REPLACE AGGREGATE bloomfilter_uda (
    text,
    int
)
    SFUNC bloomfilter_udf
    STYPE bloomfilter_udt
    INITCOND {};

1) When I call the aggregate, I would like to pass sample_size with a sub-query, e.g.

==> "SELECT bloomfilter_uda(name, (SELECT count(*) FROM test_table)) FROM test_table;" <==

Is that possible with Cassandra?

2) When I try to register the bloomfilter_uda, I get the following error:

==> InvalidRequest: Error from server: code=2200 [Invalid query] message="Invalid set literal for (dummy) of type bloomfilter_udt" <==

Can I just pass Cassandra data types as a state (map, list, set)?

3) Assuming all of the above is my mistake, how can I access the fields of the state? For example:

==> state.n_as_sample_size <==

Is this somehow possible?

I'd appreciate some help/hints.

Thanks

Andreas

Tags: cassandra, udt, uda, udf

1 Answer

alexandre.dutra answered ·
1) When I call the aggregate, I would like to pass sample_size with a sub-query [...] Is that possible with Cassandra?

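No. CQL does not support sub-queries, so the row count would have to be obtained with a separate query first.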


2) When I try to register the bloomfilter_uda, I get the following error [...] Can I just pass Cassandra data types as a state (map, list, set)?

A UDT can be used as the state type as well, but INITCOND must then be a valid UDT literal, which means initializing at least one field:

CREATE OR REPLACE AGGREGATE bloomfilter_uda ( text, int )
    SFUNC bloomfilter_udf
    STYPE bloomfilter_udt
    INITCOND { n_as_sample_size : 0 };

This is a subtlety of the CQL parser; if you input just {} the parser would be fooled into thinking that this is a set literal (an empty set).
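
For illustration, a fuller initial state might look like the sketch below. It assumes the field names from the question and that your Cassandra version accepts empty collection literals inside a UDT literal; the zero values are placeholders, not meaningful filter parameters. Any field left out of the literal is simply null.

```cql
CREATE OR REPLACE AGGREGATE bloomfilter_uda ( text, int )
    SFUNC bloomfilter_udf
    STYPE bloomfilter_udt
    INITCOND {
        n_as_sample_size : 0,
        m_as_number_of_buckets : 0,
        p_as_next_prime_above_m : 0,
        hash_for_string_coefficient_a : [],
        hash_for_number_coefficients_a : [],
        hash_for_number_coefficients_b : [],
        -- {} is accepted here because the field is declared as a map
        bloom_filter_as_map : {}
    };
```

Because at least one named field is present, the outer braces can no longer be mistaken for an empty set.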


3) Assuming all of the above is my mistake, how can I access the fields of the state?

Inside functions and aggregates, if you need to access or modify a tuple or a user-defined type, you actually need to use the DataStax Java driver 3.x API for User-defined types:

CREATE OR REPLACE FUNCTION bloomfilter_udf (
    state bloomfilter_udt,
    value text,
    sample_size int
)
    CALLED ON NULL INPUT
    RETURNS bloomfilter_udt
    LANGUAGE java AS
        $$
        // Fields are written by name through the typed setters of UDTValue.
        state.setInt("n_as_sample_size", 42);
        state.setInt("m_as_number_of_buckets", 42);
        state.setLong("p_as_next_prime_above_m", 4242L);
        List<Long> hashForStringCoefficients = ...;
        state.setList("hash_for_string_coefficient_a",
            hashForStringCoefficients, Long.class);
        // etc.
        return state;
        $$;

The variable state inside the Java block is of type UDTValue.
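
Reading fields works the same way, through the typed getters. The following is only a sketch of that pattern, not a real Bloom filter: the bucket computation based on String.hashCode() is a placeholder I made up for illustration, and it ignores the coefficient fields and sample_size entirely. It assumes the getInt, getMap and setMap methods of the driver 3.x UDTValue class.

```cql
CREATE OR REPLACE FUNCTION bloomfilter_udf (
    state bloomfilter_udt,
    value text,
    sample_size int
)
    CALLED ON NULL INPUT
    RETURNS bloomfilter_udt
    LANGUAGE java AS
        $$
        // The function is CALLED ON NULL INPUT, so guard against null values.
        if (value == null) {
            return state;
        }
        // Read fields by name; getInt returns 0 when the field is null.
        int buckets = state.getInt("m_as_number_of_buckets");
        // Copy the map before changing it, in case the returned map is immutable.
        java.util.Map<Integer, Integer> filter = new java.util.HashMap<>(
            state.getMap("bloom_filter_as_map", Integer.class, Integer.class));
        if (buckets > 0) {
            // Placeholder hash for illustration only (an assumption, not the real scheme).
            int bucket = Math.floorMod(value.hashCode(), buckets);
            filter.put(bucket, 1);
        }
        // Write the updated map back into the state and return it.
        state.setMap("bloom_filter_as_map", filter, Integer.class, Integer.class);
        return state;
        $$;
```

So the equivalent of state.n_as_sample_size from the question would be state.getInt("n_as_sample_size").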

1 comment

Thanks, that's what I was looking for. Is there any reason why previous answers and my updates to the question are not shown in this forum?
