Each scan gets a deterministic overall fingerprint (SHA-256 over sorted path+fileHash pairs) plus per-file SHA-256 hashes and stored text content (binary: hash+size only). On upload the skill is always re-scanned and classified vs prior scans as new / identical / modified, with a per-fingerprint check counter, a "most similar known skill" link, and a file-level diff view.

Deviations from the plan:
- Relation matching keys off shared file *paths* (Jaccard over paths, tie-break on hashes), not hash-Jaccard alone. Hash-Jaccard is always 0 for single-file edits (text paste = one SKILL.md) and would misclassify every edited single-file skill as "new". Similarity is content-aware: identical files = 1.0, changed text files use the line-level LCS ratio, added/removed/changed-binary files = 0. (The path-Jaccard part was later revised, see the code-review fix below.)
- parseText no longer uses the display name as the file path (the path is now fixed to "SKILL.md"), so identical pastes with different names are "identical", not "modified".

Backend: skillFingerprint.ts, lineDiff.ts (+lineSimilarity), skillParser.ts (per-file hash+isBinary), routes/scans.ts (computeRelation, content similarity, checkCount, comparedScan, GET /scans/:id/compare/:otherId).
DB: scans fingerprint/relation/similarity/comparedScanId (+index), scan_files hash/content.
API spec + orval codegen regenerated.
UI: fingerprint card + compare link on report, relation badges in history, new /vergleich/:id/:otherId page with side-by-side summaries and expandable line diff. German UI, no emojis.

Verified end-to-end against the running API and screenshotted both UI pages; test data cleaned up afterward.

Code-review fix: relation classification no longer relies on path-Jaccard (every text paste shares the path SKILL.md, so unrelated pastes were falsely linked as "modified"). computeRelation now selects the candidate by content-aware similarity and only returns "modified" when similarity >= 40 or a file is byte-identical; otherwise "new". Updated the OpenAPI similarity description; removed the now-unused jaccard import.

Replit-Task-Id: 79a8e472-6635-493c-8995-3233ba7df75c
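
The fingerprint and relation rules described in the message can be sketched roughly as follows. This is a minimal illustration, not the code in skillFingerprint.ts, lineDiff.ts or routes/scans.ts: the NUL separator, the LCS ratio formula 2*LCS/(linesA+linesB), the averaging over the union of paths, the 0-100 scale, and the names StoredFile, fileSimilarity, skillSimilarity and classify are all assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: content is null for binary files (hash+size only).
interface StoredFile {
  path: string;
  hash: string; // hex SHA-256 of the file's bytes
  content: string | null;
}

type Relation = "new" | "identical" | "modified";

// Overall fingerprint: SHA-256 over the sorted path+hash pairs, so the
// result does not depend on the order in which files were uploaded.
// The "\0" and "\n" separators are assumptions.
function skillFingerprint(files: StoredFile[]): string {
  const pairs = files.map((f) => `${f.path}\0${f.hash}`).sort();
  return createHash("sha256").update(pairs.join("\n")).digest("hex");
}

// Length of the longest common subsequence of two line arrays (row-wise DP).
function lcsLength(a: string[], b: string[]): number {
  let prev: number[] = new Array<number>(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const cur: number[] = new Array<number>(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      cur[j] =
        a[i - 1] === b[j - 1]
          ? prev[j - 1] + 1
          : Math.max(prev[j], cur[j - 1]);
    }
    prev = cur;
  }
  return prev[b.length];
}

// Line-level LCS ratio in [0, 1]; the exact formula is an assumption.
function lineSimilarity(a: string, b: string): number {
  const la = a.split("\n");
  const lb = b.split("\n");
  return (2 * lcsLength(la, lb)) / (la.length + lb.length);
}

// Per-file score: identical = 1, changed text = LCS ratio,
// added / removed / changed binary = 0.
function fileSimilarity(a?: StoredFile, b?: StoredFile): number {
  if (!a || !b) return 0;
  if (a.hash === b.hash) return 1;
  if (a.content === null || b.content === null) return 0;
  return lineSimilarity(a.content, b.content);
}

// Overall similarity on a 0-100 scale: mean per-file score over the union
// of paths (the aggregation is an assumption).
function skillSimilarity(a: StoredFile[], b: StoredFile[]): number {
  const mapA = new Map<string, StoredFile>(a.map((f): [string, StoredFile] => [f.path, f]));
  const mapB = new Map<string, StoredFile>(b.map((f): [string, StoredFile] => [f.path, f]));
  const paths = new Set<string>([...a.map((f) => f.path), ...b.map((f) => f.path)]);
  if (paths.size === 0) return 0;
  let total = 0;
  paths.forEach((p) => {
    total += fileSimilarity(mapA.get(p), mapB.get(p));
  });
  return Math.round((total / paths.size) * 100);
}

// Relation of an upload to one candidate scan, following the code-review fix:
// "modified" only at similarity >= 40 or when a byte-identical file is shared.
function classify(upload: StoredFile[], candidate: StoredFile[]): Relation {
  if (skillFingerprint(upload) === skillFingerprint(candidate)) return "identical";
  const sharesFile = upload.some((u) => candidate.some((c) => c.hash === u.hash));
  return skillSimilarity(upload, candidate) >= 40 || sharesFile ? "modified" : "new";
}
```

Under these assumptions, two unrelated pastes that both live at SKILL.md share no lines and no hashes, so they score 0 and stay "new", which is the false-link case the code-review fix addresses.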
18 lines
644 B
TypeScript
import { pgTable, serial, text, integer } from "drizzle-orm/pg-core";
import { scansTable } from "./scans";

export const scanFilesTable = pgTable("scan_files", {
  id: serial("id").primaryKey(),
  scanId: integer("scan_id")
    .notNull()
    .references(() => scansTable.id, { onDelete: "cascade" }),
  path: text("path").notNull(),
  kind: text("kind").notNull(),
  language: text("language"),
  size: integer("size").notNull().default(0),
  hash: text("hash").notNull().default(""),
  content: text("content"),
});

export type ScanFile = typeof scanFilesTable.$inferSelect;
export type InsertScanFile = typeof scanFilesTable.$inferInsert;
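
As a rough illustration of how a row for this table might be filled, the sketch below hashes a file's bytes and stores text content only for non-binary files, matching the "binary: hash+size only" rule from the commit message. The ScanFileRow interface is a local stand-in for InsertScanFile so the example runs without Drizzle; looksBinary (a NUL-byte check on the first 8000 bytes) and toScanFileRow are hypothetical helpers, not the logic in skillParser.ts.

```typescript
import { createHash } from "node:crypto";

// Local stand-in mirroring the insertable columns of scan_files (assumption).
interface ScanFileRow {
  scanId: number;
  path: string;
  kind: string;
  language: string | null;
  size: number;
  hash: string;
  content: string | null;
}

// Hypothetical heuristic: treat a file as binary if its first 8000 bytes
// contain a NUL byte. The real isBinary check may differ.
function looksBinary(bytes: Buffer): boolean {
  return bytes.subarray(0, 8000).includes(0);
}

// Build one row: hash and size are always stored, content only for text.
function toScanFileRow(
  scanId: number,
  path: string,
  kind: string,
  bytes: Buffer,
  language: string | null = null,
): ScanFileRow {
  return {
    scanId,
    path,
    kind,
    language,
    size: bytes.length,
    hash: createHash("sha256").update(bytes).digest("hex"),
    content: looksBinary(bytes) ? null : bytes.toString("utf8"),
  };
}
```

Because hash defaults to "" in the schema, rows written before this change carry an empty hash, so a comparison that treats equal hashes as identical files would probably want to skip empty values.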