RadosGW objects to Rados object mapping
Abhishek L
2014-09-17 14:39:40 UTC
Hi,

I'm trying to understand the internals of RadosGW, specifically how
buckets/containers and objects are mapped to rados objects. I couldn't
find any docs, but a previous mailing list discussion[1] explained
how S3/Swift objects are cut into rados objects and how manifests work.
Using the manifest to work out the rados object names, I was able to fetch
those rados objects and reconstruct a file uploaded to RadosGW.
For example:
```
# random.txt is an 8 MB text file
[***@ra:~/ceph/src]$ s3 -us put my-first-bucket/random filename=random.txt
[***@ra:~/ceph/src]$ ./radosgw-admin object stat --bucket=my-first-bucket --object=random | grep prefix
"prefix": "._op2xmptte2DD7z3_9EjQKgmmRcWRWL_",

```

Then I fetched the objects via rados and joined them back together:

```
[***@ra:~/ceph/src]$ ./rados --pool .rgw.buckets ls | grep _op2xm
default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_2
default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_1
[***@ra:~/ceph/src]$ ./rados get default.4124.1_random random.part0 --pool .rgw.buckets
[***@ra:~/ceph/src]$ ./rados get default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_1 random.part1 --pool .rgw.buckets
[***@ra:~/ceph/src]$ ./rados get default.4124.1__shadow_._op2xmptte2DD7z3_9EjQKgmmRcWRWL_2 random.part2 --pool .rgw.buckets

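# Now join the objects back: head object first, then the shadow parts in order
[***@ra:~/ceph/src]$ cat random.part0 random.part1 random.part2 > random.rados.txt
# diff prints nothing, so the reconstructed file matches the original
[***@ra:~/ceph/src]$ diff random.txt random.rados.txt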
```
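
The naming pattern visible in the listing above can be written down as a small helper. This is only a sketch: the `<marker>_<object>` and `<marker>__shadow_<prefix><n>` patterns are inferred from the object names in this one example, not taken from the RadosGW source, and the function name `rgw_object_parts` and its arguments are made up for illustration. The number of shadow parts has to be supplied by hand here; in practice it would come from the manifest.

```shell
#!/bin/sh
# Sketch only: the naming pattern is inferred from the rados listing in
# this thread, not from the RadosGW source.

# rgw_object_parts MARKER OBJECT PREFIX NPARTS
# Prints the head object name, then NPARTS shadow object names in order.
rgw_object_parts() {
    marker=$1 object=$2 prefix=$3 nparts=$4
    # Head object: <marker>_<object>
    printf '%s_%s\n' "$marker" "$object"
    # Tail objects: <marker>__shadow_<prefix><n>, numbered from 1
    i=1
    while [ "$i" -le "$nparts" ]; do
        printf '%s__shadow_%s%s\n' "$marker" "$prefix" "$i"
        i=$((i + 1))
    done
}
```

Calling `rgw_object_parts default.4124.1 random ._op2xmptte2DD7z3_9EjQKgmmRcWRWL_ 2` prints the three names used in the `rados get` commands above, in the order they need to be concatenated.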

I'm trying to find similar information on how radosgw stores buckets
and their metadata in rados objects: what information those objects
contain and how they are updated when, say, an object is added. I was
able to find the bucket name and bucket metadata stored in the .rgw
pool, but I'm not sure how a bucket knows which objects it holds, or
how the buckets owned by a user are tracked.

[1] https://www.mail-archive.com/ceph-***@vger.kernel.org/msg19747.html

Thanks
--
Abhishek
Yehuda Sadeh
2014-09-17 15:52:38 UTC
The bucket doesn't know who owns each object; this info is stored in
the object's info. The bucket index is stored as omap information in
the bucket instance object. The list of buckets per user is kept in
the user metadata object (also as omap information). There's a rados
command that lets you list the omap keys for each rados object.

Yehuda
Abhishek L
2014-09-17 18:23:29 UTC
Post by Yehuda Sadeh
The bucket doesn't know who owns each object; this info is stored in
the object's info. The bucket index is stored as omap information in
the bucket instance object.
Ah, thanks. I was able to list the objects in a bucket by listing the
omap keys of its index object in the .rgw.buckets.index pool:

```
[***@ra:~/ceph/src](⎇ master)$ ./rados -p .rgw.buckets.index ls
.dir.default.4124.2
.dir.default.4124.1

[***@ra:~/ceph/src](⎇ master)$ ./rados -p .rgw.buckets.index listomapkeys .dir.default.4124.1
big-object
file-1
object-8
random
```
Post by Yehuda Sadeh
The list of buckets per user is kept in
the user metadata object (also as omap information). There's a rados
command that lets you list the omap keys for each rados object.
I was able to get this as well, by inspecting the <uid>.buckets objects
in the .users.uid pool:

```
./rados -p .users.uid listomapkeys testid.buckets
another-bucket
my-first-bucket
```
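
The two lookups above could be wrapped in one helper. This is a sketch under assumptions: the pool names and the `.dir.<marker>` and `<uid>.buckets` object names are simply the ones seen in this thread and may differ on other setups, and the function names, the `bucket-index`/`user-buckets` keywords, and the `RADOS` variable are invented for the example.

```shell
#!/bin/sh
# Sketch only: pool and object names are the ones observed in this thread;
# other zones or configurations may use different names.

# rgw_omap_location KIND ID
# Prints "<pool> <rados object>" for the object whose omap holds the listing.
rgw_omap_location() {
    case $1 in
        bucket-index) printf '%s %s\n' .rgw.buckets.index ".dir.$2" ;;  # ID = bucket marker
        user-buckets) printf '%s %s\n' .users.uid "$2.buckets" ;;       # ID = user id
        *) echo "unknown kind: $1" >&2; return 1 ;;
    esac
}

# rgw_list_keys KIND ID
# Runs listomapkeys on that object. RADOS can point at a local build,
# e.g. RADOS=./rados (the variable name is this sketch's own).
rgw_list_keys() {
    loc=$(rgw_omap_location "$1" "$2") || return 1
    pool=${loc%% *}   # text before the first space
    obj=${loc#* }     # text after the first space
    "${RADOS:-rados}" -p "$pool" listomapkeys "$obj"
}
```

With this, `RADOS=./rados rgw_list_keys bucket-index default.4124.1` would run the same command as the bucket listing above, and `rgw_list_keys user-buckets testid` the same as the per-user one.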

Thanks for the info. I'll try to combine these mailing list discussions
into a starting point for a storage section in the radosgw developer docs.

Cheers
--
Abhishek