Data Loader
evals_hub.data_loader.retrieval
BioASQLoader
Loader for the BioASQ 12b dataset, following the style of TextEmbedLoader. Returns queries, documents, and relevances as HuggingFace Datasets.
Source code in evals_hub/data_loader/retrieval/bioasq.py
NFCorpusLoader
Source code in evals_hub/data_loader/retrieval/nfcorpus.py
__init__(split=None, ssl_verify=True)
Initialize the NFCorpusLoader with metadata.
Source code in evals_hub/data_loader/retrieval/nfcorpus.py
load_data()
Load the NFCorpus dataset.
Source code in evals_hub/data_loader/retrieval/nfcorpus.py
evals_hub.data_loader.reranking
AlloprofLoader
Source code in evals_hub/data_loader/reranking/alloprof.py
__init__(split=None, ssl_verify=True)
Initialize the AlloprofLoader with metadata.
Source code in evals_hub/data_loader/reranking/alloprof.py
load_data()
Load the Alloprof dataset.
Source code in evals_hub/data_loader/reranking/alloprof.py
evals_hub.data_loader.classification
ClassificationLoader
Bases: Protocol
Protocol for a classification loader. Requires the fields query and label.
Source code in evals_hub/data_loader/classification/protocol.py
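Because ClassificationLoader is a Protocol, a loader conforms by exposing the required fields rather than by inheriting from a base class. The sketch below illustrates that idea with a stand-in protocol. The names `ClassificationLoaderSketch`, `ToyLoader`, `MissingLabel`, and `conforms` are hypothetical, and the field types (sequences of strings and ints) are assumptions; the actual definitions in protocol.py may differ.

```python
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class ClassificationLoaderSketch(Protocol):
    """Hypothetical stand-in: any object exposing `query` and `label`."""

    query: Sequence[str]  # assumed type, not confirmed by the source
    label: Sequence[int]  # assumed type, not confirmed by the source


class ToyLoader:
    """Illustrative loader that does not inherit from the protocol."""

    def __init__(self) -> None:
        self.query = ["I wish it had worked.", "It works well."]
        self.label = [1, 0]


class MissingLabel:
    """Illustrative class that lacks the `label` field."""

    def __init__(self) -> None:
        self.query = ["no labels here"]


def conforms(obj: object) -> bool:
    # isinstance on a runtime-checkable protocol only checks that the
    # attributes exist; it does not validate their types.
    return isinstance(obj, ClassificationLoaderSketch)
```

Here `conforms(ToyLoader())` is true and `conforms(MissingLabel())` is false, since only the first object provides both fields.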
AmazonCounterFactualLoader
A loader for the Amazon Counterfactual dataset. The label field is an int: 0 (not-counterfactual) or 1 (counterfactual).
Source code in evals_hub/data_loader/classification/amazon_counterfactual.py
load_data()
Load the AmazonCounterFactual dataset.
Source code in evals_hub/data_loader/classification/amazon_counterfactual.py
evals_hub.data_loader.nli
NLILoader
Bases: Protocol
Protocol for an NLI dataset loader. Requires the fields hypothesis, premise, and label.
Source code in evals_hub/data_loader/nli/protocol.py
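One practical consequence of a shared protocol is that downstream code can be written once against the three required fields and reused with any conforming loader. The following is a minimal sketch of that pattern. `NLILoaderSketch`, `ToyNLILoader`, and `label_counts` are hypothetical names, and the field types and integer label encoding are assumptions made for illustration only.

```python
from collections import Counter
from typing import Protocol, Sequence


class NLILoaderSketch(Protocol):
    """Hypothetical stand-in for a loader with the three required fields."""

    hypothesis: Sequence[str]  # assumed type
    premise: Sequence[str]     # assumed type
    label: Sequence[int]       # assumed integer encoding


class ToyNLILoader:
    """Illustrative in-memory loader with made-up examples."""

    def __init__(self) -> None:
        self.hypothesis = ["A cat sleeps.", "It is raining.", "He runs."]
        self.premise = ["A cat is asleep.", "The sun is out.", "He is running."]
        self.label = [1, 0, 1]


def label_counts(loader: NLILoaderSketch) -> Counter:
    """Count labels after checking that the three fields are aligned."""
    sizes = {len(loader.hypothesis), len(loader.premise), len(loader.label)}
    if len(sizes) != 1:
        raise ValueError("hypothesis, premise, and label must have equal length")
    return Counter(loader.label)
```

Calling `label_counts(ToyNLILoader())` counts two examples with label 1 and one with label 0. The function never names a concrete loader class, so it would accept any object that provides the same fields.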
SciFactNLILoader
Loads the SciFact entailment dataset. Returns hypotheses, premises, and labels as HuggingFace Datasets.
Source code in evals_hub/data_loader/nli/scifact.py
__init__(split=None, seed=None, hf_subset=None)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `split` | `str` | The split to evaluate | `None` |
| `seed` | `int` | A seed to use for reproducibility | `None` |
| `hf_subset` | `str` | The language to evaluate | `None` |
Source code in evals_hub/data_loader/nli/scifact.py
XNLILoader
Load the MTEB XNLI dataset. Returns hypotheses, premises and labels.
Source code in evals_hub/data_loader/nli/xnli.py
__init__(split=None, hf_subset=None)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str` | The path to the dataset | required |
| `split` | `str` | The split of the dataset to use. Defaults to "test". | `None` |
| `hf_subset` | `str` | The HuggingFace subset to use. Defaults to "en". | `None` |
Source code in evals_hub/data_loader/nli/xnli.py