Running
Train
-
class
farm.train.
EarlyStopping
(head=0, metric='loss', save_dir=None, mode='min', patience=0, min_delta=0.001, min_evals=0)[source]¶ Bases:
object
Can be used to control early stopping with a Trainer class. Any other object can be used instead, as long as it implements the method check_stopping and provides the attribute save_dir.
-
__init__
(head=0, metric='loss', save_dir=None, mode='min', patience=0, min_delta=0.001, min_evals=0)[source]¶ - Parameters
head – the prediction head referenced by the metric.
save_dir – the directory where to save the final best model, if None, no saving.
metric – name of the dev set metric to monitor (default: loss), which is extracted from the 0th head, or a function that extracts a value from the trainer dev evaluation result. NOTE: this is different from the metric specified for the processor. The processor metric defines how to calculate one or more evaluation metric values from prediction/target sets, while this parameter names one particular such metric value, or a method to calculate that value from the result returned by a processor metric.
mode – “min” or “max”
patience – how many evaluations to wait after the best evaluation to stop
min_delta – minimum difference to a previous best value to count as an improvement.
min_evals – minimum number of evaluations to wait before using eval value
-
check_stopping
(eval_result)[source]¶ Provide the evaluation value for the current evaluation. Returns true if stopping should occur. This will save the model, if necessary.
- Parameters
eval_result – the current evaluation result
- Returns
a tuple (stopprocessing, savemodel, evalvalue) indicating if processing should be stopped and if the current model should get saved and the evaluation value used.
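Because any object with a check_stopping method and a save_dir attribute can take the place of EarlyStopping, the interplay of mode, patience and min_delta can be sketched without FARM. The class below is a hypothetical stand-in, not FARM's implementation. It assumes that eval_result is a list with one dict per prediction head, that a named metric is looked up in the dict for head, and that stopping happens once more than patience evaluations have passed without an improvement larger than min_delta. min_evals is left out for brevity.

```python
class SimpleEarlyStopping:
    """Hypothetical stand-in for EarlyStopping (not FARM's implementation)."""

    def __init__(self, head=0, metric="loss", save_dir=None, mode="min",
                 patience=0, min_delta=0.001):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.head = head
        self.metric = metric
        self.save_dir = save_dir  # attribute the Trainer expects to find
        self.mode = mode
        self.patience = patience
        self.min_delta = min_delta
        self.best_so_far = None
        self.evals_since_best = 0

    def check_stopping(self, eval_result):
        """Return the tuple (stopprocessing, savemodel, evalvalue)."""
        # Assumption: one dict per prediction head, keyed by metric name.
        if callable(self.metric):
            value = self.metric(eval_result)
        else:
            value = eval_result[self.head][self.metric]

        if self.best_so_far is None:
            improved = True  # the first evaluation is always the best so far
        elif self.mode == "min":
            improved = self.best_so_far - value > self.min_delta
        else:
            improved = value - self.best_so_far > self.min_delta

        if improved:
            self.best_so_far = value
            self.evals_since_best = 0
        else:
            self.evals_since_best += 1

        stopprocessing = self.evals_since_best > self.patience
        savemodel = improved and self.save_dir is not None
        return stopprocessing, savemodel, value
```

With patience=1, one evaluation without improvement is tolerated and the second one in a row triggers stopping.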
-
-
class
farm.train.
Trainer
(model, optimizer, data_silo, epochs, n_gpu, device, lr_schedule=None, evaluate_every=100, eval_report=True, use_amp=None, grad_acc_steps=1, local_rank=-1, early_stopping=None, log_learning_rate=False, log_loss_every=10, checkpoint_on_sigterm=False, checkpoint_every=None, checkpoint_root_dir=None, checkpoints_to_keep=3, from_epoch=0, from_step=0, global_step=0, evaluator_test=True, disable_tqdm=False, max_grad_norm=1.0)[source]¶ Bases:
object
Handles the main model training procedure. This includes performing evaluation on the dev set at regular intervals during training as well as evaluation on the test set at the end of training.
-
__init__
(model, optimizer, data_silo, epochs, n_gpu, device, lr_schedule=None, evaluate_every=100, eval_report=True, use_amp=None, grad_acc_steps=1, local_rank=-1, early_stopping=None, log_learning_rate=False, log_loss_every=10, checkpoint_on_sigterm=False, checkpoint_every=None, checkpoint_root_dir=None, checkpoints_to_keep=3, from_epoch=0, from_step=0, global_step=0, evaluator_test=True, disable_tqdm=False, max_grad_norm=1.0)[source]¶ - Parameters
optimizer – An optimizer object that determines the learning strategy to be used during training
data_silo (DataSilo) – A DataSilo object that will contain the train, dev and test datasets as PyTorch DataLoaders
epochs (int) – How many times the training procedure will loop through the train dataset
n_gpu (int) – The number of gpus available for training and evaluation.
device – The device on which the train, dev and test tensors should be hosted. Choose from “cpu” and “cuda”.
lr_schedule – An optional scheduler object that can regulate the learning rate of the optimizer
evaluate_every (int) – Perform dev set evaluation after this many steps of training.
eval_report (bool) – If evaluate_every is not 0, specifies if an eval report should be generated when evaluating
use_amp (str) – Whether to use automatic mixed precision with Apex. One of the optimization levels must be chosen. “O1” is recommended in almost all cases.
grad_acc_steps (int) – Number of training steps for which the gradients should be accumulated. Useful to achieve larger effective batch sizes that would not fit in GPU memory.
local_rank (int) – Local rank of process when distributed training via DDP is used.
early_stopping (EarlyStopping) – an initialized EarlyStopping object to control early stopping and saving of best models.
log_learning_rate (bool) – Whether to log learning rate to Mlflow
log_loss_every (int) – Log current train loss after this many train steps.
checkpoint_on_sigterm (bool) – save a checkpoint for the Trainer when a SIGTERM signal is sent. The checkpoint can be used to resume training. It is useful in frameworks like AWS SageMaker with Spot instances where a SIGTERM notifies to save the training state and subsequently the instance is terminated.
checkpoint_every (int) – save a train checkpoint after this many steps of training.
checkpoint_root_dir (Path) – the Path of directory where all train checkpoints are saved. For each individual checkpoint, a subdirectory with the name epoch_{epoch_num}_step_{step_num} is created.
checkpoints_to_keep (int) – maximum number of train checkpoints to save.
from_epoch (int) – the epoch number to start the training from. In the case when training resumes from a saved checkpoint, it is used to fast-forward training to the last epoch in the checkpoint.
from_step (int) – the step number to start the training from. In the case when training resumes from a saved checkpoint, it is used to fast-forward training to the last step in the checkpoint.
global_step (int) – the global step number across the training epochs.
evaluator_test (bool) – whether to perform evaluation on the test set
disable_tqdm (bool) – Disable tqdm progress bar (helps to reduce verbosity in some environments)
max_grad_norm (float) – Max gradient norm for clipping, default 1.0, set to None to disable
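The effect of grad_acc_steps on the effective batch size can be illustrated with plain arithmetic. This is a minimal sketch for a single device. The name batch_size is an assumption (it is not a Trainer parameter in this section), and any trailing batches that do not fill a complete accumulation window are simply ignored here.

```python
def effective_batch_size(batch_size, grad_acc_steps):
    """Number of samples that contribute to one optimizer update."""
    return batch_size * grad_acc_steps


def update_steps(n_batches, grad_acc_steps):
    """1-based batch indices after which an optimizer update would happen."""
    return [step for step in range(1, n_batches + 1)
            if step % grad_acc_steps == 0]


if __name__ == "__main__":
    # 8 samples per batch, gradients accumulated over 4 batches
    print(effective_batch_size(8, 4))
    print(update_steps(10, 4))
```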
-
train
()[source]¶ Perform the training procedure.
The training is visualized by a progress bar. It counts the epochs in a zero-based manner. For example, when you specify epochs=20 it counts from 0 to 19.
If the trainer evaluates the model with a test set, the result of the evaluation is stored in test_result.
- Returns
Returns the model after training. When you do early_stopping with a save_dir, the best model is loaded and returned.
-
classmethod
create_or_load_checkpoint
(data_silo, checkpoint_root_dir, model, optimizer, local_rank=-1, resume_from_checkpoint='latest', **kwargs)[source]¶ Try loading a saved Trainer checkpoint. If no checkpoint found, it creates a new instance of Trainer.
- Parameters
data_silo (DataSilo) – A DataSilo object that will contain the train, dev and test datasets as PyTorch DataLoaders
checkpoint_root_dir (Path) – Path of the directory where all train checkpoints are saved. Each individual checkpoint is stored in a sub-directory under it.
resume_from_checkpoint (str) – the checkpoint name to start training from, e.g., “epoch_1_step_4532”. It defaults to “latest”, using the checkpoint with the highest train steps.
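How "latest" could be resolved from the sub-directory names can be sketched with the standard library alone. The helper below is hypothetical and not FARM's code. It relies on the epoch_{epoch_num}_step_{step_num} naming described for checkpoint_root_dir, and ordering by epoch first and step second is an assumption about what "highest train steps" means.

```python
import re
from pathlib import Path

# Naming scheme taken from the checkpoint_root_dir description.
CHECKPOINT_PATTERN = re.compile(r"^epoch_(\d+)_step_(\d+)$")


def find_latest_checkpoint(checkpoint_root_dir):
    """Return the most advanced checkpoint sub-directory, or None."""
    root = Path(checkpoint_root_dir)
    if not root.is_dir():
        return None
    candidates = []
    for path in root.iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if path.is_dir() and match:
            epoch, step = int(match.group(1)), int(match.group(2))
            candidates.append((epoch, step, path))
    if not candidates:
        return None
    # Assumption: a later epoch wins, the step number breaks ties.
    return max(candidates, key=lambda item: (item[0], item[1]))[2]
```

Directories that do not follow the naming scheme are skipped, so unrelated files in the root directory do not interfere.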
-
Eval
-
class
farm.eval.
Evaluator
(data_loader, tasks, device, report=True)[source]¶ Bases:
object
Handles evaluation of a given model over a specified dataset.
-
__init__
(data_loader, tasks, device, report=True)[source]¶ - Parameters
data_loader (DataLoader) – The PyTorch DataLoader that will return batches of data from the evaluation dataset
label_maps –
device – The device on which the tensors should be processed. Choose from “cpu” and “cuda”.
metrics (list) – The list of metrics which need to be computed, one for each prediction head.
report (bool) – Whether an eval report should be generated (e.g. classification report per class).
-
eval
(model, return_preds_and_labels=False)[source]¶ Performs evaluation on a given model.
- Parameters
model (AdaptiveModel) – The model on which to perform evaluation
return_preds_and_labels (bool) – Whether to add preds and labels in the returned dicts of the
- Returns
all_results: A list of dictionaries, one for each prediction head. Each dictionary contains the metrics and reports generated during evaluation.
- Return type
list of dicts
-
Infer
-
class
farm.infer.
Inferencer
(model, processor, task_type, batch_size=4, gpu=False, name=None, return_class_probs=False, extraction_strategy=None, extraction_layer=None, s3e_stats=None, num_processes=None, disable_tqdm=False, benchmarking=False, dummy_ph=False)[source]¶ Bases:
object
Loads a saved AdaptiveModel/ONNXAdaptiveModel from disk and runs it in inference mode. Can be used for a model with prediction head (down-stream predictions) and without (using LM as embedder).
Example usage:
    # down-stream inference
    basic_texts = [
        {"text": "Schartau sagte dem Tagesspiegel, dass Fischer ein Idiot sei"},
        {"text": "Martin Müller spielt Handball in Berlin"},
    ]
    model = Inferencer.load(your_model_dir)
    model.inference_from_dicts(dicts=basic_texts)
    # LM embeddings
    model = Inferencer.load(your_model_dir, extraction_strategy="cls_token", extraction_layer=-1)
    model.inference_from_dicts(dicts=basic_texts)
-
__init__
(model, processor, task_type, batch_size=4, gpu=False, name=None, return_class_probs=False, extraction_strategy=None, extraction_layer=None, s3e_stats=None, num_processes=None, disable_tqdm=False, benchmarking=False, dummy_ph=False)[source]¶ Initializes Inferencer from an AdaptiveModel and a Processor instance.
- Parameters
model (AdaptiveModel) – AdaptiveModel to run in inference mode
processor (Processor) – A dataset specific Processor object which will turn input (file or dict) into a Pytorch Dataset.
task_type (str) – Type of task the model should be used for. Currently supporting: “embeddings”, “question_answering”, “text_classification”, “ner”. More coming soon…
batch_size (int) – Number of samples computed once per batch
gpu (bool) – If GPU shall be used
name (string) – Name for the current Inferencer model, displayed in the REST API
return_class_probs (bool) – either return probability distribution over all labels or the prob of the associated label
extraction_strategy (str) – Strategy to extract vectors. Choices: ‘cls_token’ (sentence vector), ‘reduce_mean’ (sentence vector), reduce_max (sentence vector), ‘per_token’ (individual token vectors), ‘s3e’ (sentence vector via S3E pooling, see https://arxiv.org/abs/2002.09620)
extraction_layer (int) – number of layer from which the embeddings shall be extracted. Default: -1 (very last layer).
s3e_stats (dict) – Stats of a fitted S3E model as returned by fit_s3e_on_corpus() (only needed for task_type=”embeddings” and extraction_strategy = “s3e”)
num_processes (int) – the number of processes for multiprocessing.Pool. Set to value of 1 (or 0) to disable multiprocessing. Set to None to let Inferencer use all CPU cores minus one. If you want to debug the Language Model, you might need to disable multiprocessing! Warning! If you use multiprocessing you have to close the multiprocessing.Pool again! To do so call close_multiprocessing_pool() after you are done using this class. The garbage collector will not do this for you!
disable_tqdm (bool) – Whether to disable tqdm logging (can get very verbose in multiprocessing)
dummy_ph (bool) – If True, methods of the prediction head will be replaced with a dummy method. This is used to isolate lm run time from ph run time.
benchmarking (bool) – If True, a benchmarking object will be initialised within the class and certain parts of the code will be timed for benchmarking. Should be kept False if not benchmarking since these timing checkpoints require synchronization of the asynchronous Pytorch operations and may slow down the model.
- Returns
An instance of the Inferencer.
-
classmethod
load
(model_name_or_path, revision=None, batch_size=4, gpu=False, task_type=None, return_class_probs=False, strict=True, max_seq_len=256, doc_stride=128, extraction_layer=None, extraction_strategy=None, s3e_stats=None, num_processes=None, disable_tqdm=False, tokenizer_class=None, use_fast=True, tokenizer_args=None, multithreading_rust=True, dummy_ph=False, benchmarking=False)[source]¶ Load an Inferencer incl. all relevant components (model, tokenizer, processor …) either by
specifying a public name from transformers’ model hub (https://huggingface.co/models)
or pointing to a local directory it is saved in.
- Parameters
model_name_or_path (str) – Local directory or public name of the model to load.
revision (str) – The version of model to use from the HuggingFace model hub. Can be tag name, branch name, or commit hash.
batch_size (int) – Number of samples computed once per batch
gpu (bool) – If GPU shall be used
task_type (str) – Type of task the model should be used for. Currently supporting: “embeddings”, “question_answering”, “text_classification”, “ner”. More coming soon…
strict (bool) – whether to strictly enforce that the keys loaded from saved model match the ones in the PredictionHead (see torch.nn.module.load_state_dict()). Set to False for backwards compatibility with PHs saved with older version of FARM.
max_seq_len (int) – maximum length of one text sample
doc_stride (int) – Only QA: When input text is longer than max_seq_len it gets split into parts, strided by doc_stride
extraction_strategy (str) – Strategy to extract vectors. Choices: ‘cls_token’ (sentence vector), ‘reduce_mean’ (sentence vector), reduce_max (sentence vector), ‘per_token’ (individual token vectors)
extraction_layer (int) – number of layer from which the embeddings shall be extracted. Default: -1 (very last layer).
s3e_stats (dict) – Stats of a fitted S3E model as returned by fit_s3e_on_corpus() (only needed for task_type=”embeddings” and extraction_strategy = “s3e”)
num_processes (int) – the number of processes for multiprocessing.Pool. Set to value of 0 to disable multiprocessing. Set to None to let Inferencer use all CPU cores minus one. If you want to debug the Language Model, you might need to disable multiprocessing! Warning! If you use multiprocessing you have to close the multiprocessing.Pool again! To do so call close_multiprocessing_pool() after you are done using this class. The garbage collector will not do this for you!
disable_tqdm (bool) – Whether to disable tqdm logging (can get very verbose in multiprocessing)
tokenizer_class (str) – (Optional) Name of the tokenizer class to load (e.g. BertTokenizer)
use_fast (bool) – (Optional, True by default) Indicate if FARM should try to load the fast version of the tokenizer (True) or use the Python one (False).
tokenizer_args (dict) – (Optional) Will be passed to the Tokenizer __init__ method. See https://huggingface.co/transformers/main_classes/tokenizer.html and detailed tokenizer documentation on Hugging Face Transformers.
multithreading_rust (bool) – Whether to allow multithreading in Rust, e.g. for FastTokenizers. Note: Enabling multithreading in Rust AND multiprocessing in python might cause deadlocks.
dummy_ph (bool) – If True, methods of the prediction head will be replaced with a dummy method. This is used to isolate lm run time from ph run time.
benchmarking (bool) – If True, a benchmarking object will be initialised within the class and certain parts of the code will be timed for benchmarking. Should be kept False if not benchmarking since these timing checkpoints require synchronization of the asynchronous Pytorch operations and may slow down the model.
- Returns
An instance of the Inferencer.
-
close_multiprocessing_pool
(join=False)[source]¶ Close the multiprocessing.Pool again.
If you use multiprocessing you have to close the multiprocessing.Pool again! To do so call this function after you are done using this class. The garbage collector will not do this for you!
- Parameters
join (bool) – wait for the worker processes to exit
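Since the pool is not cleaned up automatically, one way to make sure close_multiprocessing_pool() is always called, even when inference raises, is a small context manager. The closing_pool helper below is hypothetical and not part of FARM. FakeInferencer exists only so that the sketch runs without FARM installed.

```python
from contextlib import contextmanager


@contextmanager
def closing_pool(inferencer, join=True):
    """Yield the inferencer and always close its multiprocessing pool."""
    try:
        yield inferencer
    finally:
        # Runs on normal exit and when an exception escapes the with-block.
        inferencer.close_multiprocessing_pool(join=join)


class FakeInferencer:
    """Stand-in used only to make this sketch runnable without FARM."""

    def __init__(self):
        self.closed_with_join = None

    def close_multiprocessing_pool(self, join=False):
        self.closed_with_join = join


if __name__ == "__main__":
    fake = FakeInferencer()
    with closing_pool(fake) as inferencer:
        pass  # run inference here
    print(fake.closed_with_join)
```

With a real Inferencer, the object returned by Inferencer.load(...) would be passed in place of FakeInferencer().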
-
inference_from_file
(file, multiprocessing_chunksize=None, streaming=False, return_json=True)[source]¶ Run down-stream inference on samples created from an input file. The file should be in the same format as the ones used during training (e.g. squad style for QA, tsv for doc classification …) as the same Processor will be used for conversion.
- Parameters
file (str) – path of the input file for Inference
multiprocessing_chunksize (int) – number of dicts to put together in one chunk and feed to one process
streaming (bool) – return a Python generator object that yields results as they get computed, instead of blocking for all the results. To use streaming, the dicts parameter must be a generator and the num_processes argument must be set. This mode can be useful to implement large-scale non-blocking inference pipelines.
- Returns
an iterator(list or generator) of predictions
- Return type
iter
-
inference_from_dicts
(dicts, return_json=True, multiprocessing_chunksize=None, streaming=False)[source]¶ Runs down-stream inference on samples created from input dictionaries. The format of the input dicts depends on the task:
QA (FARM style): [{“questions”: [“What is X?”], “text”: “Some context containing the answer”}]
Classification / NER / embeddings: [{“text”: “Some input text”}]
Inferencer has a high-performance non-blocking streaming mode for large-scale inference use cases. With this mode, the dicts parameter can optionally be a Python generator object that yields dicts, thus avoiding loading all dicts into memory. The inference_from_dicts() method then returns a generator that yields predictions. To use streaming, set the streaming param to True and determine the optimal multiprocessing_chunksize by performing speed benchmarks.
- Parameters
dicts (iter(dict)) – Samples to run inference on provided as a list(or a generator object) of dicts. One dict per sample.
return_json (bool) – Whether the output should be in a json appropriate format. If False, it returns the prediction object where applicable, else it returns PredObj.to_json()
multiprocessing_chunksize (int) – number of dicts to put together in one chunk and feed to one process (only relevant if you do multiprocessing)
streaming (bool) – return a Python generator object that yields results as they get computed, instead of blocking for all the results. To use streaming, the dicts parameter must be a generator and the num_processes argument must be set. This mode can be useful to implement large-scale non-blocking inference pipelines.
- Returns
an iterator (list or generator) of predictions
- Return type
iter
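The input formats and the generator-based input described above can be sketched as follows. Both helper names are hypothetical. They only build the dicts, and the call to the Inferencer itself is left out so that the snippet runs without FARM.

```python
def dicts_from_lines(lines):
    """Lazily yield one input dict per non-empty line of text."""
    for line in lines:
        text = line.strip()
        if text:
            yield {"text": text}


def qa_dict(question, context):
    """Build one FARM-style QA input dict."""
    return {"questions": [question], "text": context}


if __name__ == "__main__":
    for sample in dicts_from_lines(["First document\n", "\n", "Second document\n"]):
        print(sample)
    print(qa_dict("What is X?", "Some context containing the answer"))
```

Passing an open file handle to dicts_from_lines keeps only one line in memory at a time. The resulting generator is the kind of object that could be handed to inference_from_dicts(dicts=..., streaming=True).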
-
extract_vectors
(dicts, extraction_strategy='cls_token', extraction_layer=-1)[source]¶ Converts a text into vector(s) using the language model only (no prediction head involved).
- Example:
    basic_texts = [{"text": "Some text we want to embed"}, {"text": "And a second one"}]
    result = inferencer.extract_vectors(dicts=basic_texts)
- Parameters
dicts ([dict]) – Samples to run inference on provided as a list of dicts. One dict per sample.
extraction_strategy (str) – Strategy to extract vectors. Choices: ‘cls_token’ (sentence vector), ‘reduce_mean’ (sentence vector), reduce_max (sentence vector), ‘per_token’ (individual token vectors)
extraction_layer (int) – number of layer from which the embeddings shall be extracted. Default: -1 (very last layer).
- Returns
dict of predictions
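What the extraction strategies do to a sequence of token vectors can be illustrated with plain Python lists. This is a simplified sketch based on the strategy names, not FARM's implementation. It assumes the first token vector belongs to the CLS token and it ignores padding tokens.

```python
def pool_vectors(token_vectors, extraction_strategy="cls_token"):
    """Reduce a list of per-token vectors according to the chosen strategy."""
    if not token_vectors:
        raise ValueError("token_vectors must not be empty")
    if extraction_strategy == "per_token":
        return [list(vector) for vector in token_vectors]
    if extraction_strategy == "cls_token":
        # Assumption: the first position holds the CLS token.
        return list(token_vectors[0])
    # Transpose so that each entry holds all values of one embedding dimension.
    dimensions = list(zip(*token_vectors))
    if extraction_strategy == "reduce_mean":
        return [sum(values) / len(values) for values in dimensions]
    if extraction_strategy == "reduce_max":
        return [max(values) for values in dimensions]
    raise ValueError(f"unknown extraction_strategy: {extraction_strategy}")


if __name__ == "__main__":
    tokens = [[1.0, 4.0], [3.0, 2.0]]
    for strategy in ("cls_token", "reduce_mean", "reduce_max", "per_token"):
        print(strategy, pool_vectors(tokens, strategy))
```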
-
-
class
farm.infer.
QAInferencer
(*args, **kwargs)[source]¶ Bases:
farm.infer.Inferencer
-
__init__
(*args, **kwargs)[source]¶ Initializes Inferencer from an AdaptiveModel and a Processor instance.
- Parameters
model (AdaptiveModel) – AdaptiveModel to run in inference mode
processor (Processor) – A dataset specific Processor object which will turn input (file or dict) into a Pytorch Dataset.
task_type (str) – Type of task the model should be used for. Currently supporting: “embeddings”, “question_answering”, “text_classification”, “ner”. More coming soon…
batch_size (int) – Number of samples computed once per batch
gpu (bool) – If GPU shall be used
name (string) – Name for the current Inferencer model, displayed in the REST API
return_class_probs (bool) – either return probability distribution over all labels or the prob of the associated label
extraction_strategy (str) – Strategy to extract vectors. Choices: ‘cls_token’ (sentence vector), ‘reduce_mean’ (sentence vector), reduce_max (sentence vector), ‘per_token’ (individual token vectors), ‘s3e’ (sentence vector via S3E pooling, see https://arxiv.org/abs/2002.09620)
extraction_layer (int) – number of layer from which the embeddings shall be extracted. Default: -1 (very last layer).
s3e_stats (dict) – Stats of a fitted S3E model as returned by fit_s3e_on_corpus() (only needed for task_type=”embeddings” and extraction_strategy = “s3e”)
num_processes (int) – the number of processes for multiprocessing.Pool. Set to value of 1 (or 0) to disable multiprocessing. Set to None to let Inferencer use all CPU cores minus one. If you want to debug the Language Model, you might need to disable multiprocessing! Warning! If you use multiprocessing you have to close the multiprocessing.Pool again! To do so call close_multiprocessing_pool() after you are done using this class. The garbage collector will not do this for you!
disable_tqdm (bool) – Whether to disable tqdm logging (can get very verbose in multiprocessing)
dummy_ph (bool) – If True, methods of the prediction head will be replaced with a dummy method. This is used to isolate lm run time from ph run time.
benchmarking (bool) – If True, a benchmarking object will be initialised within the class and certain parts of the code will be timed for benchmarking. Should be kept False if not benchmarking since these timing checkpoints require synchronization of the asynchronous Pytorch operations and may slow down the model.
- Returns
An instance of the Inferencer.
-
inference_from_dicts
(dicts, return_json=True, multiprocessing_chunksize=None, streaming=False) → Union[List[farm.modeling.predictions.QAPred], Generator[[farm.modeling.predictions.QAPred, None], None]][source]¶ Runs down-stream inference on samples created from input dictionaries. The format of the input dicts depends on the task:
QA (FARM style): [{“questions”: [“What is X?”], “text”: “Some context containing the answer”}]
Classification / NER / embeddings: [{“text”: “Some input text”}]
Inferencer has a high-performance non-blocking streaming mode for large-scale inference use cases. With this mode, the dicts parameter can optionally be a Python generator object that yields dicts, thus avoiding loading all dicts into memory. The inference_from_dicts() method then returns a generator that yields predictions. To use streaming, set the streaming param to True and determine the optimal multiprocessing_chunksize by performing speed benchmarks.
- Parameters
dicts (iter(dict)) – Samples to run inference on provided as a list(or a generator object) of dicts. One dict per sample.
return_json (bool) – Whether the output should be in a json appropriate format. If False, it returns the prediction object where applicable, else it returns PredObj.to_json()
multiprocessing_chunksize (int) – number of dicts to put together in one chunk and feed to one process (only relevant if you do multiprocessing)
streaming (bool) – return a Python generator object that yields results as they get computed, instead of blocking for all the results. To use streaming, the dicts parameter must be a generator and the num_processes argument must be set. This mode can be useful to implement large-scale non-blocking inference pipelines.
- Returns
an iterator (list or generator) of predictions
- Return type
iter
-
inference_from_file
(file, multiprocessing_chunksize=None, streaming=False, return_json=True) → Union[List[farm.modeling.predictions.QAPred], Generator[[farm.modeling.predictions.QAPred, None], None]][source]¶ Run down-stream inference on samples created from an input file. The file should be in the same format as the ones used during training (e.g. squad style for QA, tsv for doc classification …) as the same Processor will be used for conversion.
- Parameters
file (str) – path of the input file for Inference
multiprocessing_chunksize (int) – number of dicts to put together in one chunk and feed to one process
streaming (bool) – return a Python generator object that yields results as they get computed, instead of blocking for all the results. To use streaming, the dicts parameter must be a generator and the num_processes argument must be set. This mode can be useful to implement large-scale non-blocking inference pipelines.
- Returns
an iterator(list or generator) of predictions
- Return type
iter
-
-
class
farm.infer.
FasttextInferencer
(model, name=None)[source]¶ Bases:
object
-
extract_vectors
(dicts, extraction_strategy='reduce_mean')[source]¶ Converts a text into vector(s) using the language model only (no prediction head involved).
- Parameters
dicts ([dict]) – Samples to run inference on provided as a list of dicts. One dict per sample.
extraction_strategy (str) – Strategy to extract vectors. Choices: ‘reduce_mean’ (mean sentence vector), ‘reduce_max’ (max per embedding dim), ‘CLS’
- Returns
dict of predictions
-
Experiment
Metrics
-
farm.evaluation.metrics.
register_report
(name, implementation)[source]¶ Register a custom reporting function to be used during eval.
This can be useful:
- if you want to overwrite a report for an existing output type of prediction head (e.g. “per_token”)
- if you have a new type of prediction head and want to add a custom report for it
- Parameters
name (str) – This must match the ph_output_type attribute of the PredictionHead for which the report should be used. (e.g. TokenPredictionHead => per_token, YourCustomHead => some_new_type).
implementation (function) – Function to be executed. It must take lists of y_true and y_pred as input and return a printable object (e.g. string or dict). See sklearn.metrics.classification_report for an example.
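A report function with the required shape (lists of y_true and y_pred in, a printable object out) might look like the sketch below. The function name is hypothetical, and flat label lists are assumed. The registration call is left as a comment because it needs FARM installed.

```python
from collections import Counter


def per_label_accuracy_report(y_true, y_pred):
    """Return a dict mapping each true label to its share of correct predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in sorted(totals)}


# Registration would then look like this ("some_new_type" is the placeholder
# output type used in the parameter description):
# from farm.evaluation.metrics import register_report
# register_report("some_new_type", per_label_accuracy_report)

if __name__ == "__main__":
    print(per_label_accuracy_report(["a", "a", "b"], ["a", "b", "b"]))
```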
-
farm.evaluation.metrics.
squad
(preds, labels)[source]¶ This method calculates squad evaluation metrics a) overall, b) for questions with text answer and c) for questions with no answer
-
farm.evaluation.metrics.
top_n_accuracy
(preds, labels)[source]¶ This method calculates the percentage of documents for which the model makes top n accurate predictions. A top n accurate prediction is defined as follows: for any given question-document pair, there can be multiple predictions from the model and multiple labels. If any of those predictions overlaps at all with any of the labels, the predictions for that pair are considered top n accurate.
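The overlap criterion can be sketched with answer spans written as inclusive (start, end) tuples. This representation is an assumption made for illustration. The actual function works on FARM's own prediction and label structures, and the helper names below are hypothetical.

```python
def spans_overlap(a, b):
    """True if two inclusive (start, end) spans share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def top_n_accuracy_sketch(preds, labels):
    """Share of question-document pairs where any prediction overlaps any label."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not preds:
        return 0.0
    hits = sum(
        any(spans_overlap(p, l) for p in pair_preds for l in pair_labels)
        for pair_preds, pair_labels in zip(preds, labels)
    )
    return hits / len(preds)


if __name__ == "__main__":
    preds = [[(0, 5), (10, 12)], [(3, 4)]]
    labels = [[(11, 15)], [(7, 9)]]
    print(top_n_accuracy_sketch(preds, labels))
```

In the first pair the prediction (10, 12) overlaps the label (11, 15), in the second pair nothing overlaps, so one of two pairs counts as top n accurate.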
-
farm.evaluation.metrics.
text_similarity_acc_and_f1
(preds, labels)[source]¶ Returns accuracy and F1 scores for top-1(highest) ranked sequence(context/passage) for each sample/query
- Parameters
preds (List of numpy array containing similarity scores for each sequence in batch) – list of numpy arrays of dimension n1 x n2 containing n2 predicted ranks for n1 sequences/queries
labels (List of list containing values(0/1)) – list of arrays of dimension n1 x n2 where each array contains n2 labels(0/1) indicating whether the sequence/passage is a positive(1) passage or hard_negative(0) passage
- Returns
predicted ranks of passages for each query
-
farm.evaluation.metrics.
text_similarity_avg_ranks
(preds, labels)[source]¶ Calculates average predicted rank of positive sequence(context/passage) for each sample/query
- Parameters
preds (List of numpy array containing similarity scores for each sequence in batch) – list of numpy arrays of dimension n1 x n2 containing n2 predicted ranks for n1 sequences/queries
labels (List of list containing values(0/1)) – list of arrays of dimension n1 x n2 where each array contains n2 labels(0/1) indicating whether the sequence/passage is a positive(1) passage or hard_negative(0) passage
- Returns
average predicted ranks of positive sequence/passage for each sample/query
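The idea of an average rank for the positive passage can be sketched in plain Python. The sketch rests on several assumptions that the description above does not confirm: preds holds raw similarity scores (higher means more similar), each query has exactly one positive passage, and ranks are 1-based. The function name is hypothetical.

```python
def average_positive_rank(scores, labels):
    """Average 1-based rank of the positive passage over all queries."""
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and aligned")
    ranks = []
    for query_scores, query_labels in zip(scores, labels):
        # Passage indices ordered from most to least similar.
        order = sorted(range(len(query_scores)),
                       key=lambda i: query_scores[i], reverse=True)
        positive_index = query_labels.index(1)  # assumes one positive per query
        ranks.append(order.index(positive_index) + 1)
    return sum(ranks) / len(ranks)


if __name__ == "__main__":
    scores = [[0.1, 0.9, 0.3], [0.8, 0.2, 0.5]]
    labels = [[0, 1, 0], [0, 0, 1]]
    print(average_positive_rank(scores, labels))
```

For the first query the positive passage has the highest score (rank 1), for the second it is ranked second, giving an average of 1.5.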
-
farm.evaluation.metrics.
text_similarity_metric
(preds, labels)[source]¶ Returns accuracy, F1 scores and average rank scores for text similarity task
- Parameters
preds (List of numpy array containing similarity scores for each sequence in batch) – list of numpy arrays of dimension n1 x n2 containing n2 predicted ranks for n1 sequences/queries
labels (List of list containing values(0/1)) – list of arrays of dimension n1 x n2 where each array contains n2 labels(0/1) indicating whether the sequence/passage is a positive(1) passage or hard_negative(0) passage
- Returns
metrics (accuracy, F1, average rank) for the text similarity task
File utils
Utilities for working with the local dataset cache. This file is adapted from the AllenNLP library at https://github.com/allenai/allennlp Copyright by the AllenNLP authors.
-
farm.file_utils.
url_to_filename
(url, etag=None)[source]¶ Convert url into a hashed filename in a repeatable way. If etag is specified, append its hash to the url’s, delimited by a period.
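A repeatable url-to-filename mapping of this kind can be sketched with hashlib. The use of SHA-256 and UTF-8 encoding is an assumption borrowed from the AllenNLP helper this module is adapted from, and the function name carries a _sketch suffix to mark it as an illustration rather than FARM's code.

```python
from hashlib import sha256


def url_to_filename_sketch(url, etag=None):
    """Hash the url, and append the hashed etag after a period if given."""
    filename = sha256(url.encode("utf-8")).hexdigest()
    if etag:
        filename += "." + sha256(etag.encode("utf-8")).hexdigest()
    return filename


if __name__ == "__main__":
    print(url_to_filename_sketch("https://example.com/model.bin"))
    print(url_to_filename_sketch("https://example.com/model.bin", etag="abc123"))
```

The same url always maps to the same name, and a changed etag yields a different name, so an updated remote file does not collide with a stale cached copy.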
-
farm.file_utils.
filename_to_url
(filename, cache_dir=None)[source]¶ Return the url and etag (which may be None) stored for filename. Raise EnvironmentError if filename or its stored metadata do not exist.
-
farm.file_utils.
download_from_s3
(s3_url: str, cache_dir: str = None, access_key: str = None, secret_access_key: str = None, region_name: str = None)[source]¶ Download a “folder” from s3 to local. Skip already existing files. Useful for downloading all files of one model. The default and recommended authentication follows boto3’s approach of checking for ENV variables, .aws/credentials etc. (see https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html). However, there’s also the option to pass access_key, secret_access_key and region_name directly, as this is needed in some enterprise environments with local s3 deployments.
- Parameters
s3_url – Url of the “folder” in s3 (e.g. s3://mybucket/my_modelname)
cache_dir – Optional local directory where the files shall be stored. If not supplied, we’ll use a subfolder in torch’s cache dir (~/.cache/torch/farm)
access_key – Optional S3 Access Key
secret_access_key – Optional S3 Secret Access Key
region_name – Optional Region Name
- Returns
local path of the folder
-
farm.file_utils.
s3_request
(func)[source]¶ Wrapper function for s3 requests in order to create more helpful error messages.
-
farm.file_utils.
fetch_archive_from_http
(url, output_dir, proxies=None)[source]¶ Fetch an archive (zip or tar.gz) from a url via http and extract content to an output directory.
- Parameters
url (str) – http address
output_dir (str) – local path
proxies (dict) – proxies details as required by requests library
- Returns
bool if anything got fetched
-
farm.file_utils.
read_set_from_file
(filename)[source]¶ Extract a de-duped collection (set) of text from a file. Expected file format is one item per line.
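Reading a one-item-per-line file into a set can be sketched in a few lines. This is an illustration under the assumption that trailing whitespace is stripped from every line, not necessarily what FARM does in detail.

```python
def read_set_from_file_sketch(filename):
    """Collect the lines of a text file into a de-duplicated set."""
    with open(filename, "r", encoding="utf-8") as handle:
        # rstrip() removes the newline and any other trailing whitespace.
        return {line.rstrip() for line in handle}
```

Because the result is a set, repeated lines collapse into one entry and the original line order is not preserved.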
-
farm.file_utils.
unnestConfig
(config)[source]¶ This function creates a list of config files for evaluating parameters with different values. If a config parameter is of type list this list is iterated over and a config object without lists is returned. Can handle lists inside any number of parameters.
Can handle nested (one level) configs
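The expansion of list-valued parameters into individual configs can be sketched with itertools.product. The helper below is hypothetical and works on a flat dict only. The one-level nesting that unnestConfig supports is not covered here.

```python
from itertools import product


def expand_config(config):
    """Return one list-free config dict per combination of list values."""
    keys = list(config)
    # Wrap scalar values so that every parameter offers a list of options.
    options = [value if isinstance(value, list) else [value]
               for value in config.values()]
    return [dict(zip(keys, combination)) for combination in product(*options)]


if __name__ == "__main__":
    grid = {"learning_rate": [1e-5, 2e-5], "epochs": 2, "batch_size": [16, 32]}
    for single_config in expand_config(grid):
        print(single_config)
```

Two list-valued parameters with two values each produce four configs, while scalar parameters such as epochs are copied unchanged into every one of them.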