WE HAVE SUNSET THIS LISTSERV - Join us at collectionspace@lyrasislists.org
Kerr, Nathan
Mon, May 22, 2017 7:50 PM
Hi All,
The drive where we keep our cspace media (Nuxeo binaries) failed. We
recovered most of the data, but various derivatives of some blobs are
missing. I am assuming that the files were lost and not recovered when the
drive was restored.
Is there some way to regenerate derivatives for existing blobs other than
removing and re-attaching the media via the UI? I couldn't find anything in
the documentation, so I thought I'd put it to the group.
Thanks,
Nathan
--
Nathan Kerr
nkerr@museumca.org
John Lowe
Mon, May 22, 2017 9:02 PM
Nathan,
There's no "system-internal" method for regenerating blobs that I know of,
but perhaps someone will chime in with a good idea.
I think you will have to remove the "damaged" Media records and then reload
the images.
Easiest might be to:
- Make a list of the CSIDs of damaged media records.
- Write a script or otherwise wrap the CSIDs in REST requests to retrieve
the Original images.
- Write another script to DELETE these records via REST.
- Use the BMU (or other tool) to re-create the Media records and attach
them to images.
(Since the majority of your images were not uploaded and related using the
BMU, there is probably a step to rename the image files to include the
accession number so the BMU can do the right thing...)
Do feel free to contact me directly if I can help further,
John
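The retrieve-then-delete steps John outlines could be sketched roughly as below. This only builds the list of requests and sends nothing. The URL layout (/cspace-services/blobs/{csid}/content and /cspace-services/media/{csid}), the function name plan_requests, and the idea of pairing each media CSID with its blob CSID are assumptions for illustration, not something confirmed in this thread; check them against your own services layer first.

```python
from typing import Iterable, List, Tuple

# ASSUMPTION: these paths are a guess at the services-layer URL layout.
# Verify them against your own CollectionSpace deployment before use.
BLOB_CONTENT_PATH = "/cspace-services/blobs/{csid}/content"
MEDIA_PATH = "/cspace-services/media/{csid}"


def plan_requests(
    base_url: str, pairs: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Build (method, url) pairs for each (media_csid, blob_csid).

    For every damaged record the plan first fetches the original image,
    then deletes the Media record. Nothing is sent over the network.
    """
    base = base_url.rstrip("/")
    plan: List[Tuple[str, str]] = []
    for media_csid, blob_csid in pairs:
        media_csid, blob_csid = media_csid.strip(), blob_csid.strip()
        if not media_csid or not blob_csid:
            continue  # skip blank rows in the input list
        # Save the original image before touching the record.
        plan.append(("GET", base + BLOB_CONTENT_PATH.format(csid=blob_csid)))
        # Only then remove the damaged Media record.
        plan.append(("DELETE", base + MEDIA_PATH.format(csid=media_csid)))
    return plan
```

Printing the plan and reviewing it before running any DELETE is probably the safer order of operations, since a deleted record cannot be inspected afterwards.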
Ray Lee
Tue, May 23, 2017 1:43 AM
Hi Nathan,
First I'd want to verify that the files are missing. Maybe they just have
incorrect permissions, or your problem is actually in the database.
First find the filename of a blob that exhibits the problem (in this
example, page1.png). Find the repositoryid in blobs_common:
nuxeo_default=# select repositoryid from blobs_common where name = 'page1.png';
             repositoryid
--------------------------------------
 b76736a8-d212-49a7-a08f-8c52bcaf4e20
(1 row)
This should return one row. Find the derivatives, using the repositoryid:
nuxeo_default=# select view.* from hierarchy h inner join view on view.id = h.id where h.parentid = 'b76736a8-d212-49a7-a08f-8c52bcaf4e20';
                  id                  |        filename        | width |     description     | tag |    title     | height
--------------------------------------+------------------------+-------+---------------------+-----+--------------+--------
 858e8ad6-fd20-4928-a896-b86c17653d02 | Medium_page1.jpg       |   550 | Medium size         |     | Medium       |    550
 4758b4b0-822a-46a6-ae89-93d5c292ddd0 | Thumbnail_page1.jpg    |   100 | Thumbnail size      |     | Thumbnail    |    100
 59f7c8e7-6717-4a43-980d-8264f1d24bf1 | Small_page1.jpg        |   280 | Small size          |     | Small        |    280
 afb2493d-300b-4a48-9701-0e1af9115486 | OriginalJpeg_page1.jpg |   600 | Original jpeg image |     | OriginalJpeg |    600
(4 rows)
There should be a medium derivative. Find the content, using its id:
nuxeo_default=# select content.* from hierarchy h inner join content on content.id = h.id where h.parentid = '858e8ad6-fd20-4928-a896-b86c17653d02';
                  id                  | mime-type  |               data               |       name       | length |              digest              | encoding
--------------------------------------+------------+----------------------------------+------------------+--------+----------------------------------+----------
 ce9cc335-c225-4a3a-b271-50929364f969 | image/jpeg | a55dc8a033eece421a0d2bf120960a91 | Medium_page1.jpg |   6915 | a55dc8a033eece421a0d2bf120960a91 |
(1 row)
The data column contains the name of the file that should exist in the
filesystem. In this example, I would expect to find the file
a5/5d/a55dc8a033eece421a0d2bf120960a91 in the Nuxeo binary data store (note
the two levels of directory hashing, using the first four hex digits of the
hash).
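The directory hashing described above can be written down as a small helper. This is only a sketch of the layout as Ray describes it (first two hex digits, next two hex digits, then the full hash); the function name binary_relpath is made up for illustration.

```python
from pathlib import PurePosixPath


def binary_relpath(digest: str) -> str:
    """Map a value from the content.data column to its relative path.

    Layout as described in the thread: <first 2 chars>/<next 2 chars>/<hash>.
    """
    digest = digest.strip().lower()
    if len(digest) < 4:
        raise ValueError("digest too short to derive two directory levels")
    return str(PurePosixPath(digest[:2], digest[2:4], digest))


# The example from the thread:
print(binary_relpath("a55dc8a033eece421a0d2bf120960a91"))
```

The result is relative to the root of the binary data store, whose location depends on the installation.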
If the file is missing, I would next try to determine the scope of the
problem. You could write a script to run through the content table, and see
how many files are missing. That might influence how I would recommend
fixing it.
Ray
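A script along the lines Ray suggests might look like the sketch below. It assumes the hashes have already been exported from the data column of the content table (for example, one per line in a text file) and that store_root points at the root of the binary data store; both the export step and the function name check_digests are assumptions for illustration. It also separates "missing" from "present but unreadable", since incorrect permissions were mentioned as a possible cause.

```python
import os
from collections import Counter
from typing import Iterable, List, Tuple


def check_digests(
    store_root: str, digests: Iterable[str]
) -> Tuple[Counter, List[str]]:
    """Count how many expected binaries are ok, unreadable, or missing.

    Returns the counts plus the list of digests whose file is absent.
    """
    counts: Counter = Counter()
    missing: List[str] = []
    for raw in digests:
        digest = raw.strip()
        if not digest:
            continue  # tolerate blank lines in the exported list
        # Two levels of directory hashing, as described in the thread.
        path = os.path.join(store_root, digest[:2], digest[2:4], digest)
        if not os.path.isfile(path):
            counts["missing"] += 1
            missing.append(digest)
        elif not os.access(path, os.R_OK):
            # File exists but the current user cannot read it.
            counts["unreadable"] += 1
        else:
            counts["ok"] += 1
    return counts, missing
```

Run it as the same user the application server runs as, otherwise the "unreadable" count says little about what the server itself can see.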
Kerr, Nathan
Tue, May 23, 2017 6:28 PM
Thanks John and Ray.
Ray - I know some files are missing because I have found at least one
missing dir in the filesystem (getting the hash from the error logs), but
was unsure how to actually determine the scope of the issue so this is
really helpful!
--
Nathan Kerr
Digital Resource Program Manager
Oakland Museum of California
1000 Oak St
Oakland CA, 94607
510-318-8494
nkerr@museumca.org
Yousuf Nejati
Tue, May 23, 2017 6:43 PM
From my experience, you can have missing blobs in the nuxeo-data directory
backup if the CollectionSpace server is not shut down when a snapshot is
taken for the backups. Maybe that has something to do with it?
Good luck!
-Yousuf
yousuf.cspace@gmail.com
yousufnejati.com/collectionspace
Ray Lee
Tue, May 23, 2017 9:44 PM
Once you know the names of your missing files (the long hashes from the
data column of the content table), check the convertcache to see if they're
still there.
Typically this is in nuxeo-server/data/convertcache in your tomcat
directory. This cache is supposed to be cleared out periodically, but in
older versions of CSpace it was allowed to grow indefinitely (CSPACE-6711
https://issues.collectionspace.org/browse/CSPACE-6711), which would
actually be useful in this situation. The files in convertcache are hashed
into weird directory names, but you can do a find.
Ray
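One way to do that search without relying on how convertcache names its files is to match on file contents instead. The sketch below assumes the hashes are MD5 digests of the file contents; the 32 hex characters in the example are consistent with MD5, but the thread does not confirm it. The function names are made up for illustration.

```python
import hashlib
import os
from typing import Dict, Iterable


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, read in chunks to limit memory use."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def find_by_digest(cache_root: str, wanted: Iterable[str]) -> Dict[str, str]:
    """Walk cache_root and map each wanted digest to the first matching file.

    ASSUMPTION: the stored hash is the MD5 of the file contents.
    """
    wanted_set = {d.strip().lower() for d in wanted if d.strip()}
    found: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                digest = md5_of_file(path)
            except OSError:
                continue  # unreadable file; skip rather than abort the scan
            if digest in wanted_set and digest not in found:
                found[digest] = path
    return found
```

Any match could then be copied into the binary data store under the two-level path derived from its hash. Hashing every file in a large cache is slow, so this suits a one-off recovery rather than routine use.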
On Tue, May 23, 2017 at 11:28 AM, Kerr, Nathan nkerr@museumca.org wrote:
Thanks John and Ray.
Ray - I know some files are missing because I have found at least one
missing dir in the filesystem (getting the hash from the error logs), but
was unsure how to actually determine the scope of the issue so this is
really helpful!
On Mon, May 22, 2017 at 9:43 PM, Ray Lee rhlee@berkeley.edu wrote:
Hi Nathan,
First I'd want to verify that the files are missing. Maybe they just have
incorrect permissions, or your problem is actually in the database.
First find the filename of a blob that exhibits the problem (in this
example, page1.png). Find the repositoryid in blobs_common:
nuxeo_default=# select repositoryid from blobs_common where name =
'page1.png';
repositoryid
b76736a8-d212-49a7-a08f-8c52bcaf4e20
(1 row)
This should return one row. Find the derivatives, using the repositoryid:
nuxeo_default=# select view.* from hierarchy h inner join view on view.id
= h.id where h.parentid = 'b76736a8-d212-49a7-a08f-8c52bcaf4e20';
id | filename | width |
description | tag | title | height
--------------------------------------+---------------------
---+-------+---------------------+-----+--------------+--------
858e8ad6-fd20-4928-a896-b86c17653d02 | Medium_page1.jpg | 550 |
Medium size | | Medium | 550
4758b4b0-822a-46a6-ae89-93d5c292ddd0 | Thumbnail_page1.jpg | 100 |
Thumbnail size | | Thumbnail | 100
59f7c8e7-6717-4a43-980d-8264f1d24bf1 | Small_page1.jpg | 280 |
Small size | | Small | 280
afb2493d-300b-4a48-9701-0e1af9115486 | OriginalJpeg_page1.jpg | 600 |
Original jpeg image | | OriginalJpeg | 600
(4 rows)
There should be a medium derivative. Find the content, using its id:
nuxeo_default=# select content.* from hierarchy h inner join content on
content.id = h.id where h.parentid = '858e8ad6-fd20-4928-a896-b86c1
7653d02';
id | mime-type | data
| name | length | digest
| encoding
--------------------------------------+------------+--------
--------------------------+------------------+--------+-----
-----------------------------+----------
ce9cc335-c225-4a3a-b271-50929364f969 | image/jpeg |
a55dc8a033eece421a0d2bf120960a91 | Medium_page1.jpg | 6915 |
a55dc8a033eece421a0d2bf120960a91 |
(1 row)
The data column contains the name of the file that should exist in the
filesystem. In this example, I would expect to find the file
a5/5d/a55dc8a033eece421a0d2bf120960a91 in the nuxeo binary data store
(note the two-levels of directory hashing, using the first four digits of
the hash).
If the file is missing, I would next try to determine the scope of the
problem. You could write a script to run through the content table, and see
how many files are missing. That might influence how I would recommend
fixing it.
Ray
On Mon, May 22, 2017 at 12:50 PM, Kerr, Nathan nkerr@museumca.org
wrote:
Hi All,
The drive where we keep our cspace media (Nuxeo binaries) failed, we
recovered most of the data but are having an issue where various
derivatives of some blobs are missing. I am assuming that the files were
lost and not recovered when the drive was restored.
Is there some way to regenerate derivatives for existing blobs other
than removing and re-attaching the media via the ui? I couldn't find
anything in the documentation so thought I'd put it to the group.
Thanks,
Nathan
--
Nathan Kerr
nkerr@museumca.org
Talk mailing list
Talk@lists.collectionspace.org
http://lists.collectionspace.org/mailman/listinfo/talk_lists
.collectionspace.org
--
Nathan Kerr
Digital Resource Program Manager
Oakland Museum of California
1000 Oak St
Oakland CA, 94607
510-318-8494 <(510)%20318-8494>
nkerr@museumca.org
Once you know the names of your missing files (the long hashes from the
view table), check the convertcache to see if they're still there.
Typically this is in nuxeo-server/data/convertcache in your tomcat
directory. This cache is supposed to be cleared out periodically, but in
older versions of CSpace it was allowed to grow indefinitely (CSPACE-6711
<https://issues.collectionspace.org/browse/CSPACE-6711>), which would
actually be useful in this situation. The files in convertcache are hashed
into weird directory names, but you can do a find.
Ray
On Tue, May 23, 2017 at 11:28 AM, Kerr, Nathan <nkerr@museumca.org> wrote:
> Thanks John and Ray.
>
> Ray - I know some files are missing because I have found at least one
> missing dir in the filesystem (getting the hash from the error logs), but
> was unsure how to actually determine the scope of the issue so this is
> really helpful!
>
>
>
> On Mon, May 22, 2017 at 9:43 PM, Ray Lee <rhlee@berkeley.edu> wrote:
>
>> Hi Nathan,
>> First I'd want to verify that the files are missing. Maybe they just have
>> incorrect permissions, or your problem is actually in the database.
>>
>> First find the filename of a blob that exhibits the problem (in this
>> example, page1.png). Find the repositoryid in blobs_common:
>>
>> nuxeo_default=# select repositoryid from blobs_common where name = 'page1.png';
>>              repositoryid
>> --------------------------------------
>>  b76736a8-d212-49a7-a08f-8c52bcaf4e20
>> (1 row)
>>
>> This should return one row. Find the derivatives, using the repositoryid:
>>
>> nuxeo_default=# select view.* from hierarchy h inner join view on view.id = h.id where h.parentid = 'b76736a8-d212-49a7-a08f-8c52bcaf4e20';
>>                   id                  |        filename        | width |     description     | tag |    title     | height
>> --------------------------------------+------------------------+-------+---------------------+-----+--------------+--------
>>  858e8ad6-fd20-4928-a896-b86c17653d02 | Medium_page1.jpg       |   550 | Medium size         |     | Medium       |    550
>>  4758b4b0-822a-46a6-ae89-93d5c292ddd0 | Thumbnail_page1.jpg    |   100 | Thumbnail size      |     | Thumbnail    |    100
>>  59f7c8e7-6717-4a43-980d-8264f1d24bf1 | Small_page1.jpg        |   280 | Small size          |     | Small        |    280
>>  afb2493d-300b-4a48-9701-0e1af9115486 | OriginalJpeg_page1.jpg |   600 | Original jpeg image |     | OriginalJpeg |    600
>> (4 rows)
>>
>> There should be a medium derivative. Find the content, using its id:
>>
>> nuxeo_default=# select content.* from hierarchy h inner join content on content.id = h.id where h.parentid = '858e8ad6-fd20-4928-a896-b86c17653d02';
>>                   id                  | mime-type  |               data               |       name       | length |              digest              | encoding
>> --------------------------------------+------------+----------------------------------+------------------+--------+----------------------------------+----------
>>  ce9cc335-c225-4a3a-b271-50929364f969 | image/jpeg | a55dc8a033eece421a0d2bf120960a91 | Medium_page1.jpg |   6915 | a55dc8a033eece421a0d2bf120960a91 |
>> (1 row)
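
The three lookups quoted above can probably be chained into a single query. The following is a sketch only: the table and column names come from the quoted psql sessions, but the joined query itself, the function name, and the placeholder argument are assumptions and have not been run against a real CollectionSpace database. It takes any Python DB-API connection; the placeholder defaults to "?" (as used by sqlite3) and would be "%s" for a driver such as psycopg2.

```python
# Chains the lookups: blob name -> repositoryid -> derivative (view) -> content row.
DERIVATIVE_QUERY = """
select v.filename, c.data
from blobs_common b
join hierarchy hv on hv.parentid = b.repositoryid
join "view" v on v.id = hv.id
join hierarchy hc on hc.parentid = v.id
join content c on c.id = hc.id
where b.name = {placeholder}
order by v.filename
"""


def derivative_files(connection, blob_name, placeholder="?"):
    """Return (derivative filename, expected binary-store file name) pairs for one blob."""
    cursor = connection.cursor()
    # Only the driver's parameter marker is formatted in; the blob name is bound safely.
    cursor.execute(DERIVATIVE_QUERY.format(placeholder=placeholder), (blob_name,))
    return cursor.fetchall()
```

Because these are inner joins, a derivative that has no content row at all would simply be absent from the result, which is itself worth checking for.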
>>
>> The data column contains the name of the file that should exist in the
>> filesystem. In this example, I would expect to find the file
>> a5/5d/a55dc8a033eece421a0d2bf120960a91 in the nuxeo binary data store
>> (note the two levels of directory hashing, using the first four characters
>> of the hash: a5, then 5d).
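
The path layout described above can be written as a small helper. This is a sketch based only on the a5/5d/... example in the quoted message; the function name and the store root argument are hypothetical, and the real location of the binary data store depends on the installation.

```python
import os


def binary_path(store_root, digest):
    """Build the expected path of a binary from its hash, two directory levels deep."""
    # First two characters name the top directory, the next two the second level.
    return os.path.join(store_root, digest[:2], digest[2:4], digest)
```

For the example hash, binary_path(root, "a55dc8a033eece421a0d2bf120960a91") points at a5/5d/a55dc8a033eece421a0d2bf120960a91 under the given root.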
>>
>> If the file is missing, I would next try to determine the scope of the
>> problem. You could write a script to run through the content table, and see
>> how many files are missing. That might influence how I would recommend
>> fixing it.
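
One way such a scope check might look, as a hedged sketch: it takes a list of hashes (for example, the data column of the content table exported to a text file, one hash per line) and sorts them into present, unreadable, and missing. The function name and report keys are hypothetical, and the two-level path layout is assumed from the example above.

```python
import os


def classify_digests(store_root, digests):
    """Sort digests into present, unreadable, and missing files in the binary store."""
    report = {"present": [], "unreadable": [], "missing": []}
    for digest in digests:
        digest = digest.strip()
        if not digest:
            continue  # skip blank lines from an exported list
        path = os.path.join(store_root, digest[:2], digest[2:4], digest)
        if not os.path.isfile(path):
            report["missing"].append(digest)
        elif not os.access(path, os.R_OK):
            # The file exists but the current user cannot read it.
            report["unreadable"].append(digest)
        else:
            report["present"].append(digest)
    return report
```

The readability check reflects the user running the script, not necessarily the account Tomcat runs under, and a directory that cannot be traversed will make its files look missing. Running it as the Tomcat user gives the most meaningful counts.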
>>
>> Ray
>>
>> On Mon, May 22, 2017 at 12:50 PM, Kerr, Nathan <nkerr@museumca.org>
>> wrote:
>>
>>> Hi All,
>>>
>>> The drive where we keep our cspace media (Nuxeo binaries) failed, we
>>> recovered most of the data but are having an issue where various
>>> derivatives of some blobs are missing. I am assuming that the files were
>>> lost and not recovered when the drive was restored.
>>>
>>> Is there some way to regenerate derivatives for existing blobs other
>>> than removing and re-attaching the media via the ui? I couldn't find
>>> anything in the documentation so thought I'd put it to the group.
>>>
>>> Thanks,
>>> Nathan
>>>
>>> --
>>> *Nathan Kerr*
>>> nkerr@museumca.org
>>>
>>>
>>>
>>
>
>
> --
> *Nathan Kerr*
> Digital Resource Program Manager
> Oakland Museum of California
> 1000 Oak St
> Oakland CA, 94607
> 510-318-8494
> nkerr@museumca.org
>
>
RL
Ray Lee
Fri, May 26, 2017 8:59 PM
Let's move this conversation to Nathan's post on the questions forum:
https://wiki.collectionspace.org/questions/165937722/some-blobs-missing-medium-derivatives
I made a comment there.
Thanks,
Ray