hard negative
scroll ↓ to Resources
Note
- Each training sample consists of three objects (images, transactions, etc.): an anchor, a positive, and a negative.
- The idea is to make the positive a hard positive and the negative a hard negative
- This approach forces the model to learn the meaningful boundaries between similar but distinct concepts.
- for a given anchor, pick positives that are far away in the embedding space (hard positives)
- for a given anchor, pick negatives that are as close as possible in the embedding space (hard negatives)
- use embedding search to find the most similar object\image\transaction, but from a different class than the anchor and positive
- Actual user data is of utmost importance for creating hard negatives: the most valuable ones come from user interactions that reveal a mismatch between what the system considered relevant and what the user actually found useful
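The mining step above can be sketched in a few lines of numpy. This is a minimal illustration, not a production miner; the function name `mine_hard_negatives` and the toy embeddings are made up for the example, and in practice you would use an approximate nearest-neighbor index (e.g. FAISS) instead of brute-force similarity.

```python
import numpy as np

def mine_hard_negatives(anchor_emb, candidate_embs, candidate_labels, anchor_label):
    """For one anchor, rank candidates from a *different* class
    by similarity; the most similar ones are the hard negatives."""
    # cosine similarity between the anchor and every candidate
    a = anchor_emb / np.linalg.norm(anchor_emb)
    c = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = c @ a
    # mask out candidates from the anchor's own class
    sims[candidate_labels == anchor_label] = -np.inf
    # most similar first: these are the hardest negatives
    return np.argsort(-sims)

# toy data: four 2-d embeddings, two classes
embs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
labels = np.array([0, 1, 1, 0])
order = mine_hard_negatives(np.array([1.0, 0.0]), embs, labels, anchor_label=0)
# order ranks the class-1 candidates: index 1 (very close to the
# anchor, hard) before index 2 (nearly orthogonal, easy)
```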
Examples
if users delete part of AI-generated content (an email, a blog post, citations), the query paired with the deleted part makes a good hard negative
- recommender systems
- skipping a song is a weaker signal than deleting it from a playlist
- ecommerce
- bought and returned items
- data can also be generated synthetically
- create examples of the same abbreviation in different contexts
- embedder finetuning with triplets and triplet loss
- Evaluating information retrieval block
- user ignores\deletes some retrieved chunks ⇒ hard negative
- object retrieval tasks such as face recognition, where we train a model to maximize the distance between objects from different classes, even when those objects are very similar
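The triplet loss mentioned above can be sketched with plain numpy. This is a minimal, assumption-laden example: the toy 2-d embeddings are invented, and a real setup would use a deep-learning framework's built-in loss (e.g. PyTorch's `TripletMarginLoss`) so gradients flow through the embedder. It shows why hard negatives matter: an easy negative already satisfies the margin and yields zero loss, so only the hard negative produces a training signal.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: push the negative at least `margin`
    farther from the anchor than the positive."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

anchor   = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
hard_neg = np.array([0.8, 0.2])   # different class, but close to the anchor
easy_neg = np.array([0.0, 1.0])   # different class, far from the anchor

loss_hard = triplet_loss(anchor, positive, hard_neg)
loss_easy = triplet_loss(anchor, positive, easy_neg)
# the easy negative contributes zero loss (no gradient);
# the hard negative produces a positive loss the model can learn from
```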
Resources
Links to this File
table file.inlinks, file.outlinks from [[]] and !outgoing([[]]) AND -"Changelog"