i've been wanting to explore the kind of things i could build on top of atproto for a while, and now that i've got a lil time between gigs, i'm gonna explore!

the first project i wanted to try was a....

Book Tracker

i try to read 100 books every year 😰 and needed some inspiration, so i tried to build a little website that shows me what people are reading.

i'm connecting to the jetstream firehose to look for posts where people say "reading now" or "i'm reading", and attempting to turn each one into a "mention".
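to make the matching step concrete, here's a rough sketch (not my actual client) of filtering jetstream events down to posts that contain a trigger phrase. the event shape is trimmed to the fields used here, and the `TRIGGERS` list, `ReadingPost` type, and `toReadingPost` name are all made up for illustration. the url points at one of the public jetstream instances.

```typescript
// Assumed subset of a Jetstream commit event; only the fields used below.
interface JetstreamEvent {
  did: string;
  time_us: number; // microseconds since epoch
  kind: string;
  commit?: {
    operation: string;
    collection: string;
    rkey: string;
    record?: { text?: string; createdAt?: string };
  };
}

// Hypothetical intermediate shape, before any title extraction happens.
interface ReadingPost {
  postUri: string;
  postAuthor: string;
  postText: string;
  timestamp: string;
}

// Public Jetstream instance, filtered server-side to post records.
const JETSTREAM_URL =
  "wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post";

// Hypothetical trigger phrases, taken from the description above.
const TRIGGERS: RegExp[] = [/\breading now\b/i, /\bi['’]m reading\b/i];

// Parse a raw message, returning null for anything that is not valid JSON.
function parseEvent(raw: string): JetstreamEvent | null {
  try {
    return JSON.parse(raw) as JetstreamEvent;
  } catch {
    return null;
  }
}

// Return a ReadingPost for newly created posts that contain a trigger phrase.
function toReadingPost(raw: string): ReadingPost | null {
  const event = parseEvent(raw);
  if (!event || event.kind !== "commit" || !event.commit) return null;

  const commit = event.commit;
  if (commit.operation !== "create") return null;
  if (commit.collection !== "app.bsky.feed.post") return null;

  const text = commit.record?.text ?? "";
  if (!TRIGGERS.some((trigger) => trigger.test(text))) return null;

  return {
    postUri: `at://${event.did}/${commit.collection}/${commit.rkey}`,
    postAuthor: event.did,
    postText: text,
    // Fall back to the event time if the record has no createdAt.
    timestamp:
      commit.record?.createdAt ?? new Date(event.time_us / 1000).toISOString(),
  };
}
```

in a browser or a recent node, you'd feed this from something like `new WebSocket(JETSTREAM_URL)` and call `toReadingPost` on each message's data.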

if you're interested, here is my firehose client. a mention looks like this:

interface BookMention {
  title: string;
  author?: string;
  confidence: number;
  postUri: string;
  postAuthor: string;
  postText: string;
  timestamp: string;
}

i then try to find each mention on Open Library and turn it into an "Enriched Book":

interface EnrichedBook {
  id: string;
  title: string;
  author?: string;
  coverUrl?: string;
  description?: string;
  publishYear?: number;
  pageCount?: number;
  subjects?: string[];
  openLibraryUrl?: string;
  isbn?: string;
  bookshopUrl?: string;
  mentionCount: number;
  firstMentionedAt: string;
  lastMentionedAt: string;
  isActive: boolean;
  reviewNotes?: string;
}
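for the curious, here's a minimal sketch of what the Open Library step could look like, assuming the public search endpoint (`/search.json`) and the covers url pattern. the `OpenLibraryDoc` field list is a subset of what i believe the search api returns, and `buildSearchUrl`, `toEnrichedBook`, and `EnrichedBookSketch` are hypothetical names, not my real code. it only covers building the request url and mapping one result, so the actual fetch is left out.

```typescript
// Assumed subset of one document from Open Library's search.json response.
interface OpenLibraryDoc {
  key: string; // e.g. "/works/OL26834543W"
  title: string;
  author_name?: string[];
  first_publish_year?: number;
  cover_i?: number;
  number_of_pages_median?: number;
}

// Trimmed-down version of EnrichedBook, just the fields this sketch fills in.
interface EnrichedBookSketch {
  id: string;
  title: string;
  author?: string;
  coverUrl?: string;
  publishYear?: number;
  pageCount?: number;
  openLibraryUrl: string;
  mentionCount: number;
  firstMentionedAt: string;
  lastMentionedAt: string;
  isActive: boolean;
}

// Build a search URL for a title, optionally narrowed by author.
function buildSearchUrl(title: string, author?: string): string {
  const params = [`title=${encodeURIComponent(title)}`, "limit=1"];
  if (author) params.push(`author=${encodeURIComponent(author)}`);
  return `https://openlibrary.org/search.json?${params.join("&")}`;
}

// Map the first search result onto the enriched shape for a first mention.
function toEnrichedBook(
  doc: OpenLibraryDoc,
  mentionedAt: string
): EnrichedBookSketch {
  return {
    // "/works/OL26834543W" -> "OL26834543W"
    id: doc.key.replace("/works/", ""),
    title: doc.title,
    author: doc.author_name?.[0],
    coverUrl:
      doc.cover_i !== undefined
        ? `https://covers.openlibrary.org/b/id/${doc.cover_i}-M.jpg`
        : undefined,
    publishYear: doc.first_publish_year,
    pageCount: doc.number_of_pages_median,
    openLibraryUrl: `https://openlibrary.org${doc.key}`,
    mentionCount: 1,
    firstMentionedAt: mentionedAt,
    lastMentionedAt: mentionedAt,
    isActive: true,
  };
}
```

taking the first search hit at face value is the weak spot here: a fuzzy search will happily return *something* for almost any string, so a real version would want to compare the returned title against the detected one before accepting it.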

i tried a lot of wacky stuff here to determine the name of a book in a post, and honestly most of it failed :(

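// shape of the result returned by every branch below
interface ValidationResult {
  isValid: boolean;
  confidence: number;
  reason?: string;
}

// GRAMMAR_FRAGMENTS, COMMON_NEWS_SOURCES, and the other lists used below
// are arrays of regexes or strings defined elsewhere
export function validatePotentialTitle(
  title: string,
  fullText: string,
  pattern: RegExp
): ValidationResult {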
  // Basic validation
  if (title.length < 5 || title.length > 100) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title length outside valid range"
    };
  }

  // Skip if title doesn't start with a capital letter (unless in quotes)
  if (
    !/^[A-Z]/.test(title) &&
    !pattern.toString().includes('"') &&
    !pattern.toString().includes("'")
  ) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title does not start with capital letter"
    };
  }

  // Skip if title contains too many numbers (likely not a book)
  if ((title.match(/\d/g) || []).length > 3) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title contains too many numbers"
    };
  }

  const titleLower = title.toLowerCase();

  // Check for grammar fragments that indicate not a title
  if (GRAMMAR_FRAGMENTS.some((fragment) => fragment.test(title))) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title contains grammar pattern indicating it's not a book title"
    };
  }

  // Skip common news sources
  if (COMMON_NEWS_SOURCES.some((source) => titleLower.includes(source))) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title contains news source name"
    };
  }

  // Skip titles with multiple news indicator words
  const newsWordCount = NEWS_INDICATOR_WORDS.filter((word) =>
    titleLower.includes(word)
  ).length;

  if (newsWordCount >= 2) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title contains multiple news indicator words"
    };
  }

  // Skip titles with multiple spam indicator words
  const spamWordCount = SPAM_INDICATOR_WORDS.filter((word) =>
    titleLower.includes(word)
  ).length;

  if (spamWordCount >= 2) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title contains multiple spam indicator words"
    };
  }

  // Check for false positive phrases
  if (FALSE_POSITIVE_PHRASES.some((phrase) => titleLower.startsWith(phrase))) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title starts with a common false positive phrase"
    };
  }

  // Check if the post contains patterns suggesting it's not actually about books
  if (UNRELATED_CONTENT_PATTERNS.some((unrelated) => unrelated.test(fullText))) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Post contains language suggesting it's not about a book"
    };
  }

  // Skip if this looks like a subreddit/forum post format
  if (/subreddit:|Title:|URL:|Author:/i.test(fullText)) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Text has subreddit/forum post format"
    };
  }

  // Skip single-word titles
  const words = title.split(/\s+/);
  if (words.length < 2) {
    return {
      isValid: false,
      confidence: 0,
      reason: "Title is too short (fewer than 2 words)"
    };
  }

  // If title starts with a common sentence starter and the pattern has no quote or book/read cue, keep it but at low confidence
  if (
    COMMON_SENTENCE_STARTERS.includes(words[0].toLowerCase()) &&
    !pattern.toString().includes('"') &&
    !pattern.toString().includes("'") &&
    !pattern.toString().includes("book") &&
    !pattern.toString().includes("read")
  ) {
    return {
      isValid: true,
      confidence: 0.3, // Low confidence, might still be a book but needs more signals
      reason: "Title starts with common sentence starter"
    };
  }

  // If we got here, title passes basic validation
  return { isValid: true, confidence: 0.5 };
}
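the `pattern` argument above is whichever regex pulled the candidate title out of the post. as a hedged illustration of that extraction step (the patterns, confidence numbers, and the `extractCandidates` name are invented here, not my real list), a quoted title after a reading phrase gets a higher starting confidence than a bare run of capitalized words:

```typescript
interface TitleCandidate {
  title: string;
  pattern: RegExp; // the regex that matched, passed on to validation
  baseConfidence: number;
}

// Hypothetical patterns and confidence values, for illustration only.
const EXTRACTION_PATTERNS: { pattern: RegExp; baseConfidence: number }[] = [
  {
    // Quoted title after a reading phrase: i'm reading "Some Title"
    pattern: /(?:reading now|i['’]m reading)[:\s]+["“]([^"”]+)["”]/i,
    baseConfidence: 0.9,
  },
  {
    // Two or more capitalized words after a reading phrase.
    // No `i` flag here, otherwise [A-Z] would also match lowercase letters.
    pattern:
      /(?:[Rr]eading now|[Ii]['’]m reading)[:\s]+((?:[A-Z][\w'’]*)(?:\s+[A-Z][\w'’]*)+)/,
    baseConfidence: 0.6,
  },
];

// Run every pattern against the post text, highest confidence first.
function extractCandidates(text: string): TitleCandidate[] {
  const found: TitleCandidate[] = [];
  for (const { pattern, baseConfidence } of EXTRACTION_PATTERNS) {
    const match = pattern.exec(text);
    if (match && match[1]) {
      found.push({ title: match[1].trim(), pattern, baseConfidence });
    }
  }
  return found.sort((a, b) => b.baseConfidence - a.baseConfidence);
}
```

each candidate would then go through `validatePotentialTitle(candidate.title, text, candidate.pattern)`. the unquoted pattern is the risky one: any capitalized headline that follows a reading phrase looks exactly like a book title to it.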

i wouldn't say the detection was successful:

Connected to Bluesky firehose
🚫 Rejecting "my book" - Title does not start with capital letter
📕 Detected book: "t think I" (0.90)
📚 Added new book: "I Didn't Think I Could" (OL26834543W)
📕 Detected book: "Biggest Fans" (0.80)
📚 Added new book: "Biggest Fan" (OL39185109W)
📕 Detected book: "COVID Vaccine Costs Just Spiked for Millions Thanks to RFK Jr." (0.80)
📕 Detected book: "COVID Vaccine Costs Just Spiked for Millions Thanks to RFK Jr." (0.95)
📕 No Open Library data found for: "COVID Vaccine Costs Just Spiked for Millions Thanks to RFK Jr." (confidence: 0.80)
🔍 No metadata found for book mention: "COVID Vaccine Costs Just Spiked for Millions Thanks to RFK Jr." (0.80)
📕 No Open Library data found for: "COVID Vaccine Costs Just Spiked for Millions Thanks to RFK Jr." (confidence: 0.95)
🔍 No metadata found for book mention: "COVID Vaccine Costs Just Spiked for Millions Thanks to RFK Jr." (0.95)
📕 Detected book: "New spaces of emotions?  Locating crypto-Platonism for quantitative geographies" (0.90)
📕 No Open Library data found for: "New spaces of emotions?  Locating crypto-Platonism for quantitative geographies" (confidence: 0.90)
🔍 No metadata found for book mention: "New spaces of emotions?  Locating crypto-Platonism for quantitative geographies" (0.90)

but it gave me a look into what building with atproto is like, and how easy it is to grab posts from the firehose (i remember the days when that was just as easy to do on other social media sites 😪)

so now i'm playing around with the idea of folks hosting their own small networks of creative coding websites: tiny social networks that can post to bluesky, but have their own objects for things like projects. idk!