Saturday, May 10, 2025

Clean Orphaned Blob Records

Today we explore some of the challenges with ignoring maintenance of your Sitecore installation. 


Recently I found over 60% of the Sitecore master/web databases were consumed by orphaned blob records. I believe this behavior is experienced when detaching data from an existing media item or deleting the media item. The Blobs table contains this orphaned record until another process is triggered to purge the data. There have been several prior articles discussing this issue so I'm simply going to outline what I discovered along the way.

The part you care about

Database details prior to cleanup

  • We ran this on Sitecore XM 10.2 and 10.4
  • Consult with Sitecore support first if you are having a similar issue
  • The MDF file was approaching 60GB with over 90% used space
  • Total blob records in the table - 127,300
  • Total unused blob records - 104,466
    • 59,353 distinct BlobId values
Database details after cleanup
  • The MDF file now shows used space at 8% (4.7GB)
  • Total blob records in the table - 22,867
Total run time : 7 hours, 2 minutes, 23 seconds

The following script was used against a Sitecore XM 10.2 database. Proceed with caution.

/* Modified version of the cleanup script provided by Sitecore Support. This version runs with a counter to help run in small batches. */
create table #UnusedBlobIDs (
ID UNIQUEIDENTIFIER PRIMARY KEY (ID)
);
WITH [ExistingBlobs] ([BlobId]) AS (
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [SharedFields]
ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [SharedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [SharedFields]
ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [SharedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [VersionedFields]
ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [VersionedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [VersionedFields]
ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [VersionedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [UnversionedFields]
ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [UnversionedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [UnversionedFields]
ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [UnversionedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [ArchivedFields]
ON '{' + CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) + '}' = [ArchivedFields].[Value]
UNION
SELECT [Blobs].[BlobId]
FROM [Blobs]
JOIN [ArchivedFields]
ON CONVERT(NVARCHAR(MAX), [Blobs].[BlobId]) = [ArchivedFields].[Value]
)
INSERT INTO #UnusedBlobIDs (ID) SELECT DISTINCT [Blobs].[BlobId]
FROM [Blobs]
WHERE NOT EXISTS
(
SELECT NULL
FROM [ExistingBlobs]
WHERE [ExistingBlobs].[BlobId] = [Blobs].[BlobId]
)
DECLARE @r INT;
DECLARE @batchsize INT;
DECLARE @counter INT;
SET @r = 1;
SET @batchsize=1000;
SET @counter = 0
WHILE (@r > 0 AND @counter < 10000)
BEGIN
BEGIN TRANSACTION;
DELETE TOP (@batchsize) FROM [Blobs] where [Blobs].[BlobId] IN (SELECT ID from #UnusedBlobIDs);
SET @r = @@ROWCOUNT;
SET @counter = @counter + @r
COMMIT TRANSACTION;
END
DROP TABLE #UnusedBlobIDs;

One script that I found particular helpful was one that helped me periodically check the row count of the Blobs table.

/* Count all the rows in the Blobs table. Considerably faster than SELECT COUNT(1) FROM [Blobs] */
SELECT
Total_Rows= SUM(st.row_count)
FROM
sys.dm_db_partition_stats st
WHERE
object_name(object_id) = 'Blobs' AND (index_id < 2)