[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82684":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":8,"language":10,"languages":8,"totalLinesOfCode":8,"stars":11,"forks":12,"watchers":12,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":13,"forks30d":13,"starsTrendScore":17,"compositeScore":18,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":21,"topics":22,"createdAt":8,"pushedAt":8,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":13,"starSnapshotCount":13,"syncStatus":26,"lastSyncTime":27,"discoverSource":28},82684,"I-found-a-seashell-in-the-middle-of-the-desert","Hawzen\u002FI-found-a-seashell-in-the-middle-of-the-desert","Hawzen",null,"http:\u002F\u002Fshell.hawzen.me\u002F","Jupyter Notebook",176,1,0,4,6,53,12,51.2,false,"main",true,[],"2026-06-12 04:01:38","# I found a seashell in the middle of the desert\n\nTo my amazement, I found a fully solid rock that eerily resembles a seashell at the base of a cliff in the Alghat desert, Saudi Arabia. I didn't know what to make of it at first: it had the swirls and shape of a seashell but was fully a rock. More importantly, it *shouldn't* be here; the nearest coastline is Dammam's, 500 km away.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002Fancient_fossil.jpeg\" alt=\"Ancient fossil\" width=\"45%\"> \u003Cimg src=\".\u002Fmedia\u002Ffossil_location_26.07208_N_44.96803_E.jpeg\" alt=\"Fossil location\" width=\"45%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>This looks impossible\u003C\u002Fem>\n\u003C\u002Fp>\n\nCarbonate rocks (e.g. limestone), marine fossils, coral fossils, and sedimentary structures (like ripples or bioturbation) all exist in and around Alghat, which indicates that parts of the Arabian Peninsula were once submerged under the sea. 
Specifically, this was in the Late Jurassic (~150 million years ago)[1].\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002Fmarine_macroinvertebrate_lower_hanifa_figure_2_stratigraphy.png\" alt=\"Fossil location\" width=\"80%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Stratigraphic distribution figure of areas near Najd[1]\u003C\u002Fem>\n\u003C\u002Fp>\n\nNevertheless, I was still super curious about the fossil I found: what animal inhabited it? What did it look like back in the Jurassic? Does it have any modern relatives or lookalikes?\n\nThe proper way of answering these questions is to conduct a detailed analysis of the fossil (e.g. by inspecting the sediment it was found in, its shape, etc.), and this should be done by an expert paleontologist. However, I know no paleontology, nor any paleontologist, so I figured I could do it myself (how hard could it be..?), though I'll do it strictly via its shape, or what's called its *morphology*. Morphology alone is probably not accurate enough to discern lineage, as different species might look alike yet belong to different lineages, so this is probably not the best way to do it, but it sounded fun and intuitive, so I gave it a try.\n\nConcretely, I plan on:\n1. Mathematically representing the shape of a shell\n2. Defining a distance metric between shapes (so that I can find shells similar to the fossil's)\n3. Mapping out the *space* of shapes\n\nThe Zhang et al. shell dataset[2] contains 59244 images of shells from 7894 different species; good enough for me!\n\nCapturing 'shape' is actually a very hard problem; any object can be rotated by [pitch, yaw, roll](https:\u002F\u002Fsimple.wikipedia.org\u002Fwiki\u002FPitch,_yaw,_and_roll), scaled, and translated. Before starting any statistical analysis, I followed a guideline to isolate the shape from other factors:\n1. The shell must be centered at the midpoint of the picture\n2. 
The scale of the shell must be the same across all images (specifically, the maximum distance from the origin is 1)\n3. Orientation is the hardest part\n    - Pitch and yaw can be fixed by only choosing samples where the shell's opening is facing the camera. This is not perfect, but I found the dataset to be pretty consistent with its angles\n    - Roll is difficult. A shell can be rotated in any way around the axis (even whilst the opening is facing the camera). My *fix* was to use the longest radius as the reference point, and rotate the shell so that the longest radius is always on the right. This is not perfect either, but it was good enough for me.\n\nThen, I resampled the contour of the shell to 256 points relative to the center. This way, each shell is represented by a 256x2 matrix, where each row is the (x, y) coordinates of a point on the contour. Example:\n\n```python\n> contours[0].shape\n\n(256, 2)\n\n> contours[0].tolist()[:5]\n\n[[-0.38561132550239563, 0.9804982542991638],\n [-0.4204626679420471, 0.9785506725311279],\n [-0.4553140103816986, 0.976603090763092],\n [-0.4901654124259949, 0.9746555089950562],\n [-0.5230183005332947, 0.9685550928115845]]\n```\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002Fexperimentation_data\u002Fshells.png\" alt=\"Ancient fossil\" width=\"70%\"> \u003Cbr>\n  \u003Cimg src=\".\u002Fmedia\u002Fexperimentation_data\u002Fnormalized.png\" alt=\"Fossil location\" width=\"70%\">\n  \u003Cimg src=\".\u002Fmedia\u002Fexperimentation_data\u002Fcontours.png\" alt=\"Fossil location\" width=\"70%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Normalization pipeline\u003C\u002Fem>\n\u003C\u002Fp>\n\nNaturally, the distance between two shells s1 and s2 is the [squared Euclidean distance](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FEuclidean_distance) between their contour points:\n\n$$\nd(s1, s2) = \\sum_{i=1}^{256} \\left[ (s1.x_i - s2.x_i)^2 + (s1.y_i - s2.y_i)^2 \\right]\n$$\n\nRepresenting the space will require 
256 dimensions, which is a little more than just the 2 I need to plot it over x and y. Given the normalized shell contour above, it's clear that many of these dimensions are redundant (for instance, the space of all possible 256 contour points allows intersection, while the space of possible shells doesn't, AFAIK), so the space of possible shells can be condensed into a smaller *[latent space](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FLatent_space)*. To drive my point home, I'll show three examples of fully random contours (i.e. pseudo-random points around the origin).\n\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002Fexperimentation_data\u002Frandom_contours.png\" alt=\"Ancient fossil\" width=\"70%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Probably not a real shell\u003C\u002Fem>\n\u003C\u002Fp>\n\nDimensionality reduction techniques map the original 256 dimensions onto a smaller number of dimensions (e.g. 2 or 3) while trying to preserve the distance between shells as much as possible. One such technique I'll be using is [Principal Component Analysis (PCA)](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FPrincipal_component_analysis). Here's an excellent fragment that explains how PCA works: https:\u002F\u002Fstats.stackexchange.com\u002Fquestions\u002F2691\u002Fmaking-sense-of-principal-component-analysis-eigenvectors-eigenvalues\u002F140579#140579.\n\nAfter applying PCA, I retained 56.50% of the variance using only the first principal component (PC1), and 67.25% using the first two. This means we can describe a shell's shape by only two numbers, and be pretty close to the original shape!\n\nThe interesting part is trying to understand what these two numbers mean; dimension 1 in the original 256-dimensional space annotates the location of the first contour point of the shell, whereas dimension 1 of the latent space annotates a high-level feature, learned by the PCA algorithm. 
We can try to understand visually what PC1 represents by finding two shells that are far apart along PC1 yet similar in all other dimensions. \n\nEssentially, we want to find two shells i and j such that the following score is maximized:\n\n$$\n\\text{score}(i,j) =\n\\frac{|z_{i,1} - z_{j,1}|}\n{\\|\\mathbf{z}_{i,2:k} - \\mathbf{z}_{j,2:k}\\|_2}\n$$\n\nHere $z_{i,1}$ is the PC1 coordinate of shell i, and $\\mathbf{z}_{i,2:k}$ collects its coordinates on components 2 through k.\n\nPC1 seems to capture the 'pointiness' of the shell, i.e. more than 50% of variance in shell shapes can be explained by how pointy they are. PC2 seems to capture the symmetry of the shell, or perhaps the mass distribution over the vertical axis. I'll leave the interpretation of the other dimensions as an exercise for the reader (I have no idea).\n\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002FPCA.png\" alt=\"PCA\" width=\"80%\">\n\u003C\u002Fp>\n\nAnd now for the grand finale, we can plot the shells in the latent space, and see where our Alghat fossil fits in it. But first, for dramatic tension, I will discuss the plot.\n\nThe plot represents PC1 on the x-axis and PC2 on the y-axis, while color represents the roughness of a shell (computed as the difference in slope between consecutive points). The following observations are worth noting:\n\n1. Negative PC1 values (representing roundness) are much more common than positive PC1 values (representing pointiness). Yet round shells are less diverse and occupy less of the space than pointy shells\n2. Pointy shells seem to be much rougher than round shells\n3. Shells with negative PC1 values always have PC2 values close to zero; no shell in the dataset has a round but asymmetric shape. 
Below, I project such round-but-asymmetric points from the latent space back to the shape space, imagining *impossible* shells\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002Fmap.png\" alt=\"map\" width=\"100%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Map of shell latent space with example shells\u003C\u002Fem>\n\u003C\u002Fp>\n\n\n\u003Cp align=\"center\">\n  \u003Ca href=\".\u002Fmedia\u002FPC1.mp4\">\u003Cimg src=\".\u002Fmedia\u002FPC1.gif\" alt=\"PC1 animation\" width=\"45%\">\u003C\u002Fa>\n  \u003Ca href=\".\u002Fmedia\u002FPC2.mp4\">\u003Cimg src=\".\u002Fmedia\u002FPC2.gif\" alt=\"PC2 animation\" width=\"45%\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Modifying Principal Components against the mean shell\u003C\u002Fem>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\".\u002Fmedia\u002Fimpossible_shapes.mp4\">\u003Cimg src=\".\u002Fmedia\u002Fimpossible_shapes.gif\" alt=\"Impossible shell projections\" width=\"70%\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Projecting 'impossible' shells\u003C\u002Fem>\n\u003C\u002Fp>\n\nSo, what shell most closely resembles our Alghat fossil? It's Sphincterochila candidissima (try to pronounce it). However, it is really young, nowhere near the Jurassic age; instead, the earliest fossil of it dates back 38 million years[4].  
Ultimately, shape is not the best way of determining shell lineage, but its eerie similarity to the Alghat fossil is still fascinating, and perhaps points to some sort of convergent evolution, where two different species evolve to have similar shapes due to similar environmental pressures.\n\n\u003Cp align=\"center\" justify=\"top\">\n  \u003Cimg src=\".\u002Fmedia\u002Fcomparison.jpeg\" alt=\"closest\" width=\"45%\"> \n  \u003Cimg src=\"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002F8\u002F80\u002FSphincterochila_candidissima_viva_01.jpg\" alt=\"closest\" width=\"45%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cem>Left: Alghat fossil compared, Right: Sphincterochila candidissima[3]\u003C\u002Fem>\n\u003C\u002Fp>\n\n## Explore the tool\nFeel free to explore the tool and try to figure out where a shell of your choice fits in the shell latent space!\n\nhttps:\u002F\u002Fshell.hawzen.me\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fmedia\u002Fdemo.png\" >\n\u003C\u002Fp>\n\n\n\n## References\n1. Aba Alkhayl, S. S. (2022). Marine macro-invertebrate fossils from the Lower Hanifa Formation (Hawtah Member), central Saudi Arabia. Arabian Journal of Geosciences, 15, 1410. https:\u002F\u002Fdoi.org\u002F10.1007\u002Fs12517-022-10581-w\n2. Zhang, Q., Zhou, J., He, J. et al. A shell dataset, for shell features extraction and recognition. Sci Data 6, 226 (2019). https:\u002F\u002Fdoi.org\u002F10.1038\u002Fs41597-019-0230-3\n3. https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FSphincterochila_candidissima\n4. Tracey, S., Todd, J. A., & Erwin, D. H. (1993). Mollusca: Gastropoda. In M. J. Benton (Ed.), The Fossil Record 2 (pp. 131–167). London: Chapman &\n","This project uses mathematical methods to analyze a seashell fossil found in the Saudi Arabian desert. Its core features include representing shell shapes morphologically, defining a distance metric between shapes, and mapping the shape space in order to find modern shells that resemble the fossil. The project is written as a Jupyter Notebook and bases its analysis on a dataset containing 7894 different species and 59244 shell images. It is suited to individuals or researchers interested in paleontology, geology, or natural history, especially those who want to explore the origins of fossils through morphological methods.",2,"2026-06-11 04:08:56","CREATED_QUERY"]