An R-based analysis investigating whether word structure (syllable length and initial segment type) influences acceptability judgments among native Brazilian Portuguese speakers, using a nonce-word forced-choice task.
- Project Overview
- Hypotheses
- Data
- Analysis & Outputs
- Prerequisites
- Installation
- Usage
- Script Breakdown
- Interpretation of Results
- Extending & Customizing
- Data Source
- License
We analyze acceptability judgments for nonce words varying by:
- Length: number of syllables (e.g., monosyllabic vs. disyllabic)
- Initial segment: type of onset (e.g., stop vs. fricative)
Responses are binary (accept vs. reject). We visualize the contingency tables with mosaic plots and test for associations using chi-square tests.
- Null hypothesis (H₀): Word structure (length & initial segment) does not influence acceptability judgments.
- Alternative hypothesis (H₁): Word structure does influence acceptability judgments.
- File: bp-nonce.csv
- Columns:
  - response: "accept" or "reject"
  - length: e.g. "1" (monosyllabic) or "2" (disyllabic)
  - initial: initial segment category, e.g. "stop", "fricative", etc.
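As a minimal sketch of how the columns listed above could be loaded and checked, assuming the CSV has exactly these three column names. The helper `check_bp` and the small in-memory sample are hypothetical, used so the example runs without the private dataset:

```r
# Hypothetical stand-in for bp-nonce.csv, with the three documented columns.
csv_text <- 'response,length,initial
accept,1,stop
reject,1,fricative
accept,2,stop
reject,2,fricative'

# read.csv(text = ...) parses the string; with the real file one would
# presumably call read.csv("bp-nonce.csv") instead.
bp <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Hypothetical helper: stop early if expected columns or values are missing.
check_bp <- function(d) {
  stopifnot(all(c("response", "length", "initial") %in% names(d)))
  stopifnot(all(d$response %in% c("accept", "reject")))
  invisible(TRUE)
}
check_bp(bp)

# Treat all three variables as categorical for the later cross-tabulations.
bp$response <- factor(bp$response, levels = c("reject", "accept"))
bp$length   <- factor(bp$length)
bp$initial  <- factor(bp$initial)
str(bp)
```

Converting `length` to a factor matters because `read.csv` would otherwise read "1" and "2" as integers.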
- Contingency table of response × length × initial.
- Mosaic plot faceted by length and initial.
- Chi-square test of response vs. initial.
  - p ≈ 0.59 → fail to reject H₀ at α = 0.05.
- Chi-square test of length vs. initial.
  - p = 1.00 → no association (as expected).
- R ≥ 4.0
- RStudio (optional)
- Internet access (to install any missing packages)
Required R packages: ggplot2 and ggmosaic. The script will install missing packages automatically.
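One way such an automatic install step could look is sketched below. This is not necessarily what bp_nonce_analysis.R does; the helper names `missing_pkgs` and `ensure_packages` are hypothetical:

```r
# Packages named in this README.
required <- c("ggplot2", "ggmosaic")

# Hypothetical helper: return the packages whose namespace cannot be loaded.
missing_pkgs <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Hypothetical helper: install whatever is missing (needs internet access
# and a configured CRAN mirror). With install = FALSE it only reports.
ensure_packages <- function(pkgs, install = TRUE) {
  to_install <- missing_pkgs(pkgs)
  if (install && length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Dry run: show what would be installed without changing the library.
print(ensure_packages(required, install = FALSE))
```

Calling `ensure_packages(required)` with the default `install = TRUE` would perform the actual installation.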
- Clone the repository:
  git clone https://github.com/yourusername/bp-nonce-acceptability.git
  cd bp-nonce-acceptability
- Place bp-nonce.csv in the project root.
Run the analysis script:
# In R or RStudio, set working directory and source:
setwd("path/to/bp-nonce-acceptability")
source("bp_nonce_analysis.R")
This will:
- Print the contingency tables
- Display the mosaic plot
- Print chi-square test results
- Load libraries (ggplot2, ggmosaic)
- Read data from bp-nonce.csv
- Summarize with xtabs(~ response + length + initial)
- Mosaic plot:
  ggplot(bp) + geom_mosaic(aes(weight=1, x=product(response), fill=response)) + facet_grid(length ~ initial) + …
- Chi-square tests:
  chisq.test(xtabs(~ response + initial, data=bp))
  chisq.test(xtabs(~ length + initial, data=bp))
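The tabulation and test steps can be tried without the private dataset. The sketch below uses synthetic data from an assumed balanced 2 × 2 design with random responses, so the numbers it prints say nothing about the real experiment. It omits the mosaic plot to avoid depending on ggmosaic:

```r
set.seed(1)

# Synthetic stand-in: balanced design, 20 trials per length x initial cell.
design <- expand.grid(length  = c("1", "2"),
                      initial = c("stop", "fricative"),
                      trial   = 1:20)
bp <- data.frame(
  length   = design$length,
  initial  = design$initial,
  response = sample(c("accept", "reject"), nrow(design), replace = TRUE)
)

# Three-way contingency table, flattened for readable printing.
tab3 <- xtabs(~ response + length + initial, data = bp)
print(ftable(tab3))

# Response vs. initial: the test of interest.
res_ri <- chisq.test(xtabs(~ response + initial, data = bp))
print(res_ri)

# Length vs. initial: in a balanced design every cell has the same count,
# so observed equals expected, the statistic is 0, and the p-value is 1.
res_li <- chisq.test(xtabs(~ length + initial, data = bp))
print(res_li$p.value)
```

The second test illustrates why a p-value of 1.00 for length vs. initial is what a balanced design produces rather than an empirical finding.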
- Response vs. Initial: p = 0.59 → no significant association → fail to reject H₀.
- Length vs. Initial: p = 1.00 → variables are independent (by design).
Conclusion: We find no evidence that initial segment type influences acceptability in this dataset. Because response vs. length was not tested directly, this result bears only on the initial-segment part of the word-structure hypothesis.
- Test response vs. length directly.
- Add other predictors: stress pattern, coda type, etc.
- Use logistic regression (glm(response ~ length + initial, family="binomial")).
- Refine visualization colors, labels, or facet layouts.
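The first and third extensions above might be sketched as follows, again on synthetic data with random responses (an assumption, so the estimates are meaningless for the real experiment). Note that `glm` with a binomial family expects a factor or 0/1 response rather than a character column, and that the first factor level is treated as failure:

```r
set.seed(42)

# Synthetic stand-in: balanced design, 25 trials per cell, random responses.
design <- expand.grid(length  = c("1", "2"),
                      initial = c("stop", "fricative"),
                      trial   = 1:25)
bp <- data.frame(
  length   = design$length,
  initial  = design$initial,
  # "reject" is the first level, so the model estimates P(accept).
  response = factor(sample(c("accept", "reject"), nrow(design), replace = TRUE),
                    levels = c("reject", "accept"))
)

# Extension 1: test response vs. length directly.
res_rl <- chisq.test(xtabs(~ response + length, data = bp))
print(res_rl)

# Extension 3: logistic regression with both predictors.
fit <- glm(response ~ length + initial, data = bp, family = binomial)
print(summary(fit)$coefficients)

# Odds ratios for each term (exponentiated log-odds coefficients).
print(exp(coef(fit)))
```

Unlike the separate chi-square tests, the regression estimates the effect of each predictor while holding the other fixed, and an interaction term (`length * initial`) could be added in the same call.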
- Brazilian Portuguese nonce-word experiment: collected via [describe methodology or link here].
- Dataset: bp-nonce.csv (private dataset, see project repository).
This project is licensed under the MIT License. See LICENSE for details.