ramesh4f_143215 asked ·

Why is my cassandra-stress profile in DS210 not created?

Only the profile below was created:

******************** Profile(s) ********************
  Keyspace Name: stresscql
  Keyspace CQL: 
***
CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
***

  Table Name: blogposts
  Table CQL: 
***
CREATE TABLE blogposts (
      domain text,
      published_date timeuuid,
      url text,
      author text,
      title text,
      body text,
      PRIMARY KEY(domain, published_date)
) WITH CLUSTERING ORDER BY (published_date DESC) 
  AND compaction = { 'class':'LeveledCompactionStrategy' } 
  AND comment='A table to hold blog posts'
***

Why was my profile not created?

The command I ran:

root@ds210-node2:/home/ubuntu/labwork# cassandra-stress user profile=/home/ubuntu/labwork/TestProfile.yaml ops\(insert=4,user_by_email=4\) -node ds210-node1

Cassandra-stress yaml file:

root@ds210-node2:/home/ubuntu/labwork# 
root@ds210-node2:/home/ubuntu/labwork# cat /home/ubuntu/labwork/TestProfile.yaml
#
# Keyspace Name
#
keyspace: killr_video 
keyspace_definition: | 
  CREATE KEYSPACE killr_video WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

#
# Table name and create CQL 
#
table: user_by_email 
table_definition: | 
  CREATE TABLE user_by_email ( email TEXT, password TEXT, user_id UUID, PRIMARY KEY ((email)) )
#
# Keyspace Name
#
keyspace: stresscql
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

#
# Table name and create CQL
#
table: blogposts
table_definition: |
  CREATE TABLE blogposts (
        domain text,
        published_date timeuuid,
        url text,
        author text,
        title text,
        body text,
        PRIMARY KEY(domain, published_date)
  ) WITH CLUSTERING ORDER BY (published_date DESC) 
    AND compaction = { 'class':'LeveledCompactionStrategy' } 
    AND comment='A table to hold blog posts'

#
# Meta information for generating data
#
columnspec:
  - name: email 
    size: gaussian(8..30) 
    population: exp(1..1234)
  - name: password 
    size: exp(8..30) 
    population: uniform(1..1432) 
  - name: user_id 
    size: fixed(4) 
    population: uniform(1..1567)
columnspec:
  - name: domain
    size: gaussian(5..100)
    population: uniform(1..10M)
  - name: published_date
    cluster: fixed(1000)
  - name: url
    size: uniform(30..300)       
  - name: title
    size: gaussian(10..200)
  - name: author
    size: uniform(5..20)
  - name: body
    size: gaussian(100..5000)

# 
# Specs for insert queries 
# 
insert: 
  partitions: fixed(1) 
  batchtype: UNLOGGED # use unlogged batches 
  select: fixed(1)/1 
#
# Specs for insert queries
#
insert:
  partitions: fixed(1)
  select:    fixed(1)/1000
  batchtype: UNLOGGED             # use unlogged batches


#
# Read queries to run against the schema
#
queries: 
  user_by_email: 
      cql: select * from user_by_email where email = ? 
      fields: samerow
queries:
   singlepost:
      cql: select * from blogposts where domain = ? LIMIT 1 
      fields: samerow
root@ds210-node2:/home/ubuntu/labwork# 

I ran this from node2, and node1 is up.

Please help me fix this.


Would you let me know which module or exercise you are studying, so I can look into it?

Thanks!

ramesh4f_143215 replied to bettina.swynnerton ·

Course: DS210

CHAPTER: Cassandra-stress

Erick Ramirez answered ·

Cause

It looks like you misunderstood the instructions. Your configuration file contains multiple keyspaces, tables, column specifications, and so on.

Instructions

Step 7 of the exercise states:

This file contains a cassandra-stress user profile for a blogpost schema. You can use this profile to guide you, but you will want to modify it to suit your purposes.

Furthermore, step 8 of the exercise states:

In the schema section of TestProfile.yaml modify the two keyspace lines to name and create the killr_video keyspace.

Solution

To reiterate, you need to modify the profile in the configuration file -- NOT add to it. The profile in that test file was only provided as a guide.

If you followed the instructions correctly, you should end up with the following profile definition (comments omitted for brevity):

keyspace: killr_video
keyspace_definition: |
  CREATE KEYSPACE killr_video WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

table: user_by_email
table_definition: |
  CREATE TABLE user_by_email (
    email TEXT,
    password TEXT,
    user_id UUID,
    PRIMARY KEY ((email))
  )

columnspec:
  - name: email
    size: gaussian(8..30)
    population: exp(1..1000000)
  - name: password
    size: exp(8..30)
    population: uniform(1..1000000)
  - name: user_id
    size: fixed(4)
    population: uniform(1..1000000)

insert:
  partitions: fixed(1)
  batchtype: UNLOGGED       # use unlogged batches
  select: fixed(1)/1

queries:
   get_user:
      cql: select * from user_by_email where email = ?
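
One detail to watch when running this profile: as far as I understand cassandra-stress user mode, each entry in ops(...) other than insert has to name a query defined under queries: in the profile. The command in the question used user_by_email=4, while the query above is called get_user. Below is a minimal Python sketch that assembles the command line; build_stress_command is a hypothetical helper, and the path and node name are simply the ones from the question.

```python
import shlex

def build_stress_command(profile_path, node, query_name, inserts=4, reads=4):
    """Build the argument list for a cassandra-stress user-mode run.

    query_name must match a key under `queries:` in the profile
    (assumption: get_user, as in the profile above).
    """
    ops = f"ops(insert={inserts},{query_name}={reads})"
    return ["cassandra-stress", "user", f"profile={profile_path}", ops, "-node", node]

cmd = build_stress_command(
    "/home/ubuntu/labwork/TestProfile.yaml",  # path taken from the question
    "ds210-node1",                            # node name taken from the question
    "get_user",
)
# shlex.join quotes the parentheses so the line is safe to paste into a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd) would start the run without any shell quoting at all; the sketch only prints the command so you can check it first.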

Cheers!


With a single keyspace and table, I was able to run the profile. It worked.

I checked node1 and I can't see the records inserted by cassandra-stress when I run `select * from user_by_email`.

Is there something I missed here?

On node1:

cqlsh:killr_video> select * from user_by_email ;

 email | password | user_id
-------+----------+---------

(0 rows)
cqlsh:killr_video> desc user_by_email;


Please ignore the above. I didn't wait for the stress run to complete. It's working now.

cqlsh:killr_video> select * from user_by_email ;

 email                                                    | password                               | user_id
----------------------------------------------------------+----------------------------------------+--------------------------------------
              :\rGB\x1a0\x0fO \x1d\x0e"\x0b\ug#s\x0b\x01? |                           dk;n*h{$qm\n | 00000000-0000-03d5-0000-000000000354
                             l\n\x12\x03hL\x181k\x1bB(V&X |          \x05\x0f\x12\x18\x1e..MeA\x15 | 00000000-0000-057c-0000-00000000021c
                         

Thanks @Erick Ramirez and @bettina.swynnerton, I appreciate your quick responses and clear explanations.

So the bottom line is that a profile can only be used for one table.

Is cassandra-stress a kind of benchmarking tool? I come from a MySQL background, so sorry if it's a silly question.


Cassandra-stress is a useful tool to stress your Cassandra cluster with inserts and reads.

You might be interested in trying out NoSqlBench, a tool that we recently open-sourced. It is a great tool for performance testing, data model design, and sizing and deployment of new clusters and has a lot more capability than cassandra-stress.

Here is a blog post for introduction:

https://www.datastax.com/blog/2020/03/nosqlbench

We also have Katacoda exercises to get started:

https://katacoda.com/datastax/courses/nosqlbench-intro/nosqlbench

This week we are also running workshops that introduce the tool. Check out the upcoming live streams on the DataStax Developers YouTube channel.


Hi @ramesh4f_143215,

I used the profile provided by Erick, and cassandra-stress is inserting data correctly.

If only one of your nodes received the data, check that your cluster is set up correctly, verify the cluster topology, and confirm that the replication strategy defined for the keyspace killr_video matches the setup of your cluster.

bettina.swynnerton answered ·

Hi @ramesh4f_143215,

I have studied your TestProfile.yaml in more detail, and I see that you are setting up a profile that writes into multiple keyspaces and tables: killr_video.user_by_email and stresscql.blogposts.

cassandra-stress does not support this in versions that predate the following fix (which came with Cassandra version 4.0):

https://issues.apache.org/jira/browse/CASSANDRA-8780

With the cassandra-stress version used in the DS210 course, you can only execute operations against one table at a time, and only the last keyspace definition and table definition are used when the profile is run.

If you want to write data into both tables, separate the profiles and run them separately.

I hope this helps explain what you are seeing.
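
To spot this kind of problem before a run, a rough check like the following can flag top-level keys that appear more than once in a profile. It is only a sketch: it assumes top-level keys are unindented, it does a plain line scan rather than real YAML parsing, and find_duplicate_keys is a hypothetical helper name. The sample text is a trimmed-down stand-in for the combined profile in the question.

```python
import re
from collections import defaultdict

# Matches an unindented "key:" at the start of a line (a top-level YAML key).
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*):")

def find_duplicate_keys(profile_text):
    """Return {key: [line numbers]} for top-level keys that appear more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(profile_text.splitlines(), start=1):
        match = TOP_LEVEL_KEY.match(line)
        if match:
            seen[match.group(1)].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}

# Trimmed-down stand-in for the combined profile from the question.
profile = """\
keyspace: killr_video
table: user_by_email
keyspace: stresscql
table: blogposts
columnspec:
  - name: email
"""

for key, lines in find_duplicate_keys(profile).items():
    print(f"'{key}' is defined on lines {lines}; split these into separate profiles")
```

An empty result means each top-level key appears once, which is what a single-table profile should look like.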


So basically, cassandra-stress did not support multi-table operations until now?

Have I understood correctly?


@ramesh4f_143215 I think you're missing the point. For the purposes of the DS210 course, you did not follow the instructions in the exercise, which is why it's not working for you. The exercise is explicitly about stress-testing the user_by_email table. Cheers!
