---
title: "Getting started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## **What is aesopR?**

aesopR provides a tidy, public-domain corpus of *Aesop’s Fables* sourced from the Library of Congress.

It is designed for teaching, exploration, and reproducible text analysis workflows that do not rely on copyrighted text.

The package ships with:

-   `aesops_fables`: one row per fable (text + metadata)

-   `aesops_tokens`: one row per word token (analysis-ready)

-   `aesops_bing`: tokens joined with Bing sentiment labels

-   `aesops_afinn`: tokens joined with AFINN sentiment scores

In this vignette, we’ll use one fable as a running example: **“The Fox and the Grapes.”**

```
library(aesopR)
library(dplyr) # provides filter() and count(), used below
```

## **The core datasets**

### **aesops_fables: full texts and metadata**

`aesops_fables` contains the fables as complete narratives, along with a moral and a source URL.

```         
aesops_fables
```

### **aesops_tokens: tidy word-level data**

`aesops_tokens` is derived from the fable texts and is ready for word frequency, n-grams, and sentiment workflows.

```         
aesops_tokens
```
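
As a rough illustration of the n-gram workflow mentioned above, here is a minimal sketch that builds bigram counts from a one-token-per-row table. It uses a made-up toy table rather than `aesops_tokens` itself, and it assumes the rows are stored in reading order within each fable, which you should confirm before applying the same idea to the real data.

```r
library(dplyr)

# Toy token table (hypothetical data), one row per word in reading order.
toy_tokens <- tibble(
  fable_id = c("000", "000", "000", "000", "001", "001"),
  word     = c("the", "fox", "and", "grapes", "the", "crow")
)

# Pair each word with the word that follows it in the same fable.
bigrams <- toy_tokens |>
  group_by(fable_id) |>
  mutate(next_word = lead(word)) |>
  ungroup() |>
  filter(!is.na(next_word)) |> # the last word of a fable has no successor
  count(word, next_word, sort = TRUE)

bigrams
```

Grouping by `fable_id` before calling `lead()` keeps the last word of one fable from being paired with the first word of the next.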

## **Find “The Fox and the Grapes”**

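Filter `aesops_fables` on `fable_id` to pull a single fable. The running example uses `fable_id == "005"`.

```
fox_text <- aesops_fables |>
  filter(fable_id == "005")
fox_text
```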

## **Token-based exploration**

Once you have a `fable_id`, you can pull its word tokens from `aesops_tokens`.

```         
fox_tokens <- aesops_tokens |>
  filter(fable_id == "005")
head(fox_tokens)
```

### **Most common words in this fable**

```         
fox_tokens |> count(word, sort = TRUE)
```

This count is intentionally simple. In a methods class, it is a good moment to discuss:

-   stop words

-   lemmatization/stemming

-   how preprocessing choices influence results
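
To make the stop-word discussion concrete, the sketch below compares word counts before and after removing stop words. Both the token table and the four-word stop list are made up for illustration; in practice you would substitute a fable's rows from `aesops_tokens` and a published stop-word list of your choice.

```r
library(dplyr)

# Toy token table standing in for one fable (hypothetical data).
toy_tokens <- tibble(
  fable_id = "000",
  word = c("the", "fox", "saw", "the", "grapes", "and", "the", "fox", "jumped")
)

# A tiny hand-written stop-word list, for illustration only.
toy_stop_words <- tibble(word = c("the", "and", "a", "of"))

# Counts with no preprocessing.
raw_counts <- toy_tokens |>
  count(word, sort = TRUE)

# Counts after dropping every token that appears in the stop list.
filtered_counts <- toy_tokens |>
  anti_join(toy_stop_words, by = "word") |>
  count(word, sort = TRUE)

raw_counts
filtered_counts
```

In the toy data, the most frequent word changes from a function word to a content word once the stop list is applied, which is the kind of shift worth discussing when you rerun the same comparison on a real fable.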

## **Sentiment datasets (no external downloads)**

To avoid interactive prompts and downloads during checks, aesopR ships with two sentiment-joined datasets:

-   `aesops_bing` (positive/negative labels)

-   `aesops_afinn` (numeric sentiment scores)

### **Bing sentiment: positive vs negative counts**

```         
fox_bing <- aesops_bing |> filter(fable_id == "005")
table(fox_bing$sentiment)
```

### **AFINN sentiment: summary of scores**

```         
fox_afinn <- aesops_afinn |> filter(fable_id == "005")
summary(fox_afinn$value)
```

Some discussion prompts:

-   Do the sentiment signals match the *moral* of the story?

-   What words are driving the sentiment?

-   How might results change with a different lexicon?
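
One way to approach the "what words are driving the sentiment?" question is to total the scores per word and rank words by the size of their contribution. The sketch below does this on a toy table whose `value` column mirrors the one used with `aesops_afinn` above; the words and scores are invented for illustration and are not real AFINN values.

```r
library(dplyr)

# Toy rows shaped like a filtered slice of aesops_afinn.
# The scores are made up and are NOT real AFINN values.
toy_afinn <- tibble(
  word  = c("sour", "hungry", "fine", "sour", "despise"),
  value = c(-2, -1, 2, -2, -3)
)

# Total contribution of each word, largest absolute effect first.
word_contrib <- toy_afinn |>
  group_by(word) |>
  summarise(n = n(), contribution = sum(value)) |>
  arrange(desc(abs(contribution)))

# Net score for the whole (toy) fable.
net_score <- sum(toy_afinn$value)

word_contrib
net_score
```

Ranking by absolute contribution shows whether the overall score rests on one or two repeated words or is spread across many, which helps when comparing the result against the fable's moral.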

## **Where to go next**

If you’re teaching or learning research methods, aesopR works well for:

-   operational definitions (what counts as a “theme”?)

-   measurement decisions (lexicon choice, preprocessing choices)

-   replication and transparency (same corpus, different pipelines)

-   quick classroom activities (frequency, sentiment, moral inference)