Bulk load: loading Cassandra data with sstableloader from a different Cassandra cluster
I have two independent machines running Cassandra and want to migrate the data from one machine to the other.
So I first took a snapshot of the Cassandra cluster on machine 1, following the DataStax documentation.
Then I moved the data to machine 2, where I'm trying to import it with sstableloader.
As a note: the keyspace (open_weather) and table (raw_weather_data) have already been created on machine 2 and are the same as on machine 1.
The command I'm using looks as follows:
    bin/sstableloader -d localhost "path_to_snapshot"/open_weather/raw_weather_data

and I get the following error:

    Established connection to initial hosts
    Opening sstables and calculating sections to stream
    For input string: "CompressionInfo.db"
    java.lang.NumberFormatException: For input string: "CompressionInfo.db"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.parseInt(Integer.java:615)
        at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:276)
        at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:235)
        at org.apache.cassandra.io.sstable.Component.fromFilename(Component.java:120)
        at org.apache.cassandra.io.sstable.SSTable.tryComponentFromFilename(SSTable.java:160)
        at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:84)
        at java.io.File.list(File.java:1161)
        at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:78)
        at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:162)
        at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)

Unfortunately, I have no idea why.
I'm not sure whether it is related to the issue, but on machine 1 the *.db files are named rather "strangely" compared to the *.db files I have on machine 2.

*.db files on machine 1:

    la-53-big-CompressionInfo.db
    la-53-big-Data.db
    ...
    la-54-big-CompressionInfo.db
    ...

*.db files on machine 2:

    open_weather-raw_weather_data-ka-5-CompressionInfo.db
    open_weather-raw_weather_data-ka-5-Data.db

What am I missing? Any help is highly appreciated. I'm also open to other suggestions. The COPY command will not work, since it is limited to 99999999 rows as far as I know.
P.S. I didn't want to create an overly huge post, so if you need further information to help me out, let me know.
Edit: Note that I'm using Cassandra in stand-alone mode.
Edit 2: After installing the same version, 2.1.4, on the destination machine (machine 2), I still get the same error. sstableloader still fails with the above-mentioned error, and after copying the files manually (as described by lhwizard), I still have empty tables after starting Cassandra again and performing a SELECT command.
Regarding the initial tokens, I get a huge list of tokens if I run nodetool ring on machine 1. I'm not sure what to do with those.
Your data is already in the form of a snapshot (or backup). What I have done in the past is the following:
1. Install the same version of Cassandra on the restore node.
2. Edit cassandra.yaml on the restore node and make sure that cluster_name and the snitch are the same.
3. Edit the seeds: list and any other properties that were altered on the original node.
4. Get the schema from the original node using cqlsh desc keyspace.
5. Start Cassandra on the restore node and import the schema. (Steps 6 and 7 may not be necessary, but this is what I do.)
6. Stop Cassandra, then delete the contents of the /var/lib/cassandra/data/, commitlog/, and saved_caches/* folders.
7. Restart Cassandra on the restore node to recreate the correct folders, then stop it.
8. Copy the contents of the snapshots folder to each corresponding table folder on the restore node, then start Cassandra. You may want to run nodetool repair.
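
The final copy step can be scripted. The following is only a minimal sketch, not something taken from the post: the helper names `find_table_dir` and `copy_snapshot_files` are made up for illustration, and it assumes that each table directory under the data root is named either `<table>` or `<table>-<id>`. Cassandra should be stopped while the files are copied.

```python
import shutil
from pathlib import Path


def find_table_dir(data_root, keyspace, table):
    """Return the directory that holds `table` inside `keyspace`.

    Assumption (not from the post): the directory is named either
    `<table>` or `<table>-<id>`, and exactly one such directory exists.
    """
    keyspace_dir = Path(data_root) / keyspace
    matches = [
        d for d in sorted(keyspace_dir.iterdir())
        if d.is_dir() and (d.name == table or d.name.startswith(table + "-"))
    ]
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one directory for {table}, found {len(matches)}"
        )
    return matches[0]


def copy_snapshot_files(snapshot_dir, table_dir):
    """Copy every regular file from a snapshot folder into a table folder.

    Subdirectories are skipped. Returns the sorted list of copied file names.
    """
    copied = []
    for src in sorted(Path(snapshot_dir).iterdir()):
        if src.is_file():
            shutil.copy2(src, Path(table_dir) / src.name)
            copied.append(src.name)
    return copied
```

For the table in the question, a call might look like `copy_snapshot_files(snapshot_dir, find_table_dir("/var/lib/cassandra/data", "open_weather", "raw_weather_data"))`, where `snapshot_dir` is wherever the snapshot files were placed on the restore node.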
You don't need to bulk import the data. It's already in the correct format if you are using the same version of Cassandra, although you didn't specify that in your original question.
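
One way to check the "same version" condition is to compare the sstable format version embedded in the file names on both machines. The sketch below is an illustration under assumptions, not Cassandra's own parsing code: the two regular expressions are modelled only on the file names listed in the question, and `sstable_version` is a hypothetical helper. To my understanding, `ka` files are written by Cassandra 2.1.x and `la` files by 2.2.x; the post itself does not confirm this, but if it holds, the snapshot in the question came from a newer release than the 2.1.4 node it was loaded into.

```python
import re

# Layouts modelled on the file listings in the question:
#   la-53-big-Data.db
#     -> <version>-<generation>-<format>-<Component>.db
#   open_weather-raw_weather_data-ka-5-Data.db
#     -> <keyspace>-<table>-<version>-<generation>-<Component>.db
SHORT_LAYOUT = re.compile(
    r"^(?P<version>[a-z]{2})-(?P<generation>\d+)-(?P<format>[a-z]+)"
    r"-(?P<component>\w+)\.db$"
)
LONG_LAYOUT = re.compile(
    r"^(?P<keyspace>\w+)-(?P<table>\w+)-(?P<version>[a-z]{2})"
    r"-(?P<generation>\d+)-(?P<component>\w+)\.db$"
)


def sstable_version(filename):
    """Return the two-letter format version in an sstable file name.

    Returns None when the name matches neither layout.
    """
    for pattern in (SHORT_LAYOUT, LONG_LAYOUT):
        match = pattern.match(filename)
        if match:
            return match.group("version")
    return None


def versions_in(filenames):
    """Collect the distinct format versions found in a list of file names."""
    return {v for v in map(sstable_version, filenames) if v is not None}
```

If `versions_in` returns different sets for the source and destination file listings, the two nodes are not writing the same sstable format, and copying the files over directly is unlikely to work.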