CALLHOME Relationship Labels
Interpersonal relationship labels for the CALLHOME English corpus

Project Details
- Project Type : Research Dataset
- Domain : Conversational Analysis
- Technologies : Corpus Annotation, CSV
- GitHub : https://github.com/dkaterenchuk/callhome_labels
- Stars : 2
- License : GPL-3.0
CALLHOME Relationship Labels
A curated dataset of manually annotated interpersonal relationship labels for the CALLHOME English corpus, enabling research in conversational analysis and sociolinguistics.
Overview
The CALLHOME English corpus is a widely-used dataset of telephone conversations between native English speakers. This project provides detailed relationship annotations between conversation participants, offering valuable metadata for linguistic research, social network analysis, and conversational dynamics studies.
Dataset Contents
The repository includes four primary datasets in CSV format:
- PRIMARY labels - Binary classification marking relationships as “FRIEND” or “RELATIVE”
- SECONDARY labels - Additional contextual information supplementing primary designations
- NOTES - Annotator observations and metadata from the labeling process
- FULL_LIST - Consolidated dataset combining all three categories above
Research Applications
This labeled dataset enables research in several areas:
- Conversational analysis - Study how relationship types affect communication patterns
- Sociolinguistics - Analyze language variation based on interpersonal relationships
- Discourse analysis - Examine turn-taking, topic selection, and conversational strategies
- Social network analysis - Map relationship structures in conversation data
- Computational linguistics - Train models for relationship detection and classification
Resources
- GitHub Repository: callhome_labels
- Research Paper: Included in repository (1081.pdf) detailing annotation methodology and findings
- License: GPL-3.0
About the CALLHOME Corpus
The CALLHOME English corpus consists of unscripted telephone conversations between friends and family members. These relationship labels add an important dimension to the corpus, allowing researchers to study how interpersonal dynamics shape conversational behavior.
This dataset contributes to the advancement of research in conversational analysis, sociolinguistics, and computational approaches to understanding human communication.