Working with Large Files on Chameleon Cloud

I primarily use Chameleon Cloud (CC) for my research projects. It provides great flexibility because I can run bare-metal servers (e.g., 44 threads/cores, 128G+ RAM) on a seven-day lease, which is renewable as long as the hosts I'm using are not booked by others. Its support team is also amazing.

But everything becomes slow when you work with a really big dataset. For example, I'm working on a Telegram project with 1TB+ of data, which really gives me a headache. The CC machines can handle it, but they need extra configuration.

Use Object Store (OS)

The OS can store up to 8TB of data, and the guide is clear. The advantage of using the OS is that I don't have to upload the 1TB dataset every time I start a new server; I can directly mount the OS to the new server as a disk. The catch is that the dataset needs to be split into smaller files (i.e., < 4G each) for uploading. I figured that, after uploading, I could simply merge the segments back into one file. But the mounted OS is not a "real disk." The reason behind this is complicated and beyond my knowledge, but the consequence is clear: I can't operate on the segments like I would on a "real disk." I'll spare you my failed attempts and frustration. It just doesn't work! It's like Gandalf is standing in front of me: "You shall not pass!"
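
For concreteness, the split-and-upload step looks roughly like this. This is a minimal sketch assuming the OpenStack Swift CLI, with hypothetical names (archive telegram.zip, container telegram-data):

  # Split the archive into pieces safely under the ~4G per-object limit.
  split -b 3G telegram.zip telegram.zip.part_

  # Upload every piece to the object store (assumes your OpenStack
  # credentials are already sourced in the environment).
  swift upload telegram-data telegram.zip.part_*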

Mount Additional Disks

But I have to make it work. The strategy: mount a large, real disk to the server, merge the segments there, and save the merged file on that real disk.

Mounting a disk sounds like an easy task, but there are glitches all the time. This post is very helpful.

  • sudo lvmdiskscan lists all the available devices. Make sure to run it as root; otherwise, the output is limited and not helpful. It gives me the messages below. A path like /dev/nvme0n1 is the path to the device.
  /dev/nvme0n1                                                                                  [      <1.82 TiB]
  /dev/ceph-7416c6b0-b419-4227-b9a9-5ca48d295f90/osd-block-edc9fd01-90b8-4aaf-bcfb-ce263a5f72c6 [    <400.00 GiB]
  /dev/sda1                                                                                     [     558.91 GiB]
  /dev/ceph-bbcb4121-a89c-44f8-a321-ef02e6577f29/osd-block-23d73189-88cd-45b8-a315-a8a00979b82d [    <400.00 GiB]
  /dev/ceph-a905226d-5dea-49dd-ab2d-e9984fcdf9cb/osd-block-6c90347c-2758-46e4-a6f7-d5c5e84cf29c [     363.01 GiB]
  /dev/ceph-47613a0a-c021-40d6-aa63-2b0121ec2c1f/osd-block-effdb3b8-d2aa-47d1-bf41-58ae6253928d [    <158.91 GiB]
  /dev/ceph-6c63c94c-0b3a-4d41-b3e1-216ed9457527/osd-block-5d84ef65-97a7-4d8e-add0-9df1e6a8dde8 [    <200.00 GiB]
  /dev/nvme1n1                                                                                  [      <1.82 TiB]
  /dev/ceph-43dfb719-bda9-46d0-9a7a-4807522300c9/osd-block-a790ea25-2fcc-442d-8610-8cb3906b6915 [    <400.00 GiB]
  /dev/nvme1n1p1                                                                                [     500.00 GiB] LVM physical volume
  /dev/ceph-2fcca38b-ea7b-406c-bd77-f76da9e94194/osd-block-426c2997-8550-4fb6-b132-3471da6d5c14 [    <200.00 GiB]
  /dev/nvme1n1p2                                                                                [     500.00 GiB] LVM physical volume
  /dev/ceph-1b085445-bc2c-44a7-8294-93b87798717a/osd-block-fd344ea5-24ce-4f9f-a392-252897acb1e2 [    <500.00 GiB]
  /dev/nvme1n1p3                                                                                [     500.00 GiB] LVM physical volume
  /dev/ceph-0f2967c8-090a-40c0-8f10-a9baf44ca4ef/osd-block-8478ba73-8099-4ef7-a155-36459b1561dd [    <500.00 GiB]
  /dev/nvme1n1p4                                                                                [    <363.02 GiB] LVM physical volume
  /dev/ceph-04f7289a-53be-4637-9c60-a7049c6f0b90/osd-block-4d6d60ec-6b49-4f77-9e46-27071e27132c [    <500.00 GiB]
  /dev/ceph-813e17e9-dd1d-4a1f-b544-11c161b47ea2/osd-block-43fbe500-b12f-4eb2-80c1-4e576a9f048e [     588.49 GiB]
  /dev/ceph-b7bb80f7-f023-49a7-94aa-6ce06734cae2/osd-block-95705a63-e60d-43e8-8b13-3eebb00288e4 [     558.91 GiB]
  /dev/sdb1                                                                                     [     200.00 GiB] LVM physical volume
  /dev/sdb2                                                                                     [     200.00 GiB] LVM physical volume
  /dev/sdb3                                                                                     [     158.91 GiB] LVM physical volume
  /dev/sdc                                                                                      [     558.91 GiB] LVM physical volume
  /dev/sde1                                                                                     [     400.00 GiB] LVM physical volume
  /dev/sde2                                                                                     [     400.00 GiB] LVM physical volume
  /dev/sde3                                                                                     [     400.00 GiB] LVM physical volume
  /dev/sde4                                                                                     [    <588.50 GiB] LVM physical volume
  /dev/sdf1                                                                                     [      <1.75 TiB]
  6 disks
  10 partitions
  1 LVM physical volume whole disk
  11 LVM physical volumes
  • Run sudo lvscan and then sudo vgchange -ay. lvscan scans for logical volumes, and vgchange -ay activates the volume groups; I'm not sure both are strictly necessary, so I just run both.
  • Oh, yes, if you get a mount: wrong fs type, bad option, bad superblock error message, you probably need to create a file system on the disk first with, e.g., sudo mkfs.ext4 /dev/sdb1.
  • Mount the device to a folder, for example sudo mount /dev/sdb1 /root/data_store.
  • Remember: use root all the time. (The whole sequence is collected in the sketch after this list.)
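
For reference, here is the full sequence in one place. It's a sketch using the /dev/sdb1 and /root/data_store examples from above; substitute whatever lvmdiskscan reports on your node.

  sudo lvmdiskscan                        # list all available devices
  sudo lvscan                             # scan for logical volumes
  sudo vgchange -ay                       # activate any volume groups
  sudo mkfs.ext4 /dev/sdb1                # only if the device has no file system yet (this erases it!)
  sudo mkdir -p /root/data_store          # create a mount point
  sudo mount /dev/sdb1 /root/data_store   # mount the device
  df -h /root/data_store                  # verify the mount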

Following the same rationale, mount as many (available) devices as you want, do the merge operation outside of the OS folder (I suspect working inside the cloudfuse folder would cause additional errors, and I don't want to waste time finding out), and save the result to a mounted device. Now I have the following devices:

Filesystem     Size  Used  Avail  Use%  Mounted on
udev           252G     0   252G    0%  /dev
tmpfs           51G  2.4M    51G    1%  /run
/dev/sda1      550G   27G   501G    5%  /
tmpfs          252G     0   252G    0%  /dev/shm
tmpfs          5.0M     0   5.0M    0%  /run/lock
tmpfs          252G     0   252G    0%  /sys/fs/cgroup
tmpfs           51G     0    51G    0%  /run/user/1000
tmpfs           51G     0    51G    0%  /run/user/1010
/dev/sdf1      1.8T  1.1T   610G   64%  /home/cc/hold
/dev/nvme0n1   1.8T   77M   1.7T    1%  /root/tg_upload

The /dev/sdf1 disk holds the whole 1TB zip file, and I then extract it to /dev/nvme0n1. You may ask: why not upload the file to the server directly? Because it's slow; even at a 55MB/s upload speed, it would take more than 5 hours to finish.
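
Concretely, the merge-and-extract step looks roughly like this. The object-store mount point /root/os_mount and the segment names are hypothetical, carried over from the split sketch above:

  # Merge the segments from the object-store mount onto a real disk (/dev/sdf1).
  cat /root/os_mount/telegram.zip.part_* > /home/cc/hold/telegram.zip

  # Extract onto the other real disk (/dev/nvme0n1).
  unzip /home/cc/hold/telegram.zip -d /root/tg_upload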

OK, start a Jupyter Notebook under root. Now let’s get to work.
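
One note on that last step: Jupyter refuses to start under root by default, so it needs something like the following (a sketch; adjust the port to taste):

  # --allow-root overrides Jupyter's refusal to run as root; binding to
  # 0.0.0.0 makes the notebook reachable from outside the node.
  jupyter notebook --allow-root --no-browser --ip=0.0.0.0 --port=8888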
