Discards the document with the given ID, so it won't appear in search results.
It has the same visible effect as remove (both cause the
document to stop appearing in searches), but a different effect on the
internal data structures:
remove requires passing the full document to be removed
as an argument, and removes it from the inverted index immediately.
discard instead only needs the document ID, and works by
marking the current version of the document as discarded, so it is
immediately ignored by searches. This is faster and more convenient than
remove, but the index is not immediately modified. To take care of
that, vacuuming is performed after a certain number of documents are
discarded, cleaning up the index and allowing memory to be released.
After discarding a document, it is possible to re-add a new version, and
only the new version will appear in searches. In other words, discarding
and re-adding a document works exactly like removing and re-adding it. The
replace method can also be used to replace a document with a
new version.
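The discard-and-re-add behavior described above can be sketched with a toy model. This is an illustration only, not the library's implementation: the class ToyIndex, its internal-ID scheme, and the helper rawPostingCount are all hypothetical names introduced here. The sketch assumes that discarding simply forgets the mapping between the document ID and an internal ID, leaving the inverted index untouched.

```typescript
// Toy model for illustration only; NOT the library's actual implementation.
// All names here (ToyIndex, rawPostingCount, ...) are hypothetical.
class ToyIndex {
  private nextInternalId = 0;
  private internalIdOf = new Map<string, number>();  // document ID -> live internal ID
  private externalIdOf = new Map<number, string>();  // live internal ID -> document ID
  private postings = new Map<string, Set<number>>(); // term -> internal IDs

  add(id: string, text: string): void {
    if (this.internalIdOf.has(id)) throw new Error(`duplicate ID: ${id}`);
    const internalId = this.nextInternalId++;
    this.internalIdOf.set(id, internalId);
    this.externalIdOf.set(internalId, id);
    text.toLowerCase().split(/\s+/).filter(Boolean).forEach((term) => {
      if (!this.postings.has(term)) this.postings.set(term, new Set<number>());
      this.postings.get(term)!.add(internalId);
    });
  }

  // Only needs the ID: the current version is marked as gone by dropping
  // its internal ID. The inverted index itself is not modified.
  discard(id: string): void {
    const internalId = this.internalIdOf.get(id);
    if (internalId === undefined) throw new Error(`unknown ID: ${id}`);
    this.internalIdOf.delete(id);
    this.externalIdOf.delete(internalId);
  }

  // Postings whose internal ID is no longer live are skipped.
  search(term: string): string[] {
    const hits: string[] = [];
    const refs = this.postings.get(term.toLowerCase());
    if (refs) {
      refs.forEach((internalId) => {
        const id = this.externalIdOf.get(internalId);
        if (id !== undefined) hits.push(id);
      });
    }
    return hits;
  }

  // Number of postings stored for a term, including obsolete ones.
  rawPostingCount(term: string): number {
    const refs = this.postings.get(term.toLowerCase());
    return refs ? refs.size : 0;
  }
}

const index = new ToyIndex();
index.add("a", "zen and motorcycles");
index.add("b", "zen archery");

index.discard("a");
const afterDiscard = index.search("zen");    // only "b" is visible

index.add("a", "motorcycle maintenance");    // re-add a new version under the same ID
const zenHits = index.search("zen");         // the old version of "a" stays hidden
const newHits = index.search("maintenance"); // the new version of "a" is found
const rawZen = index.rawPostingCount("zen"); // the obsolete posting is still stored
```

In this model the old posting for "zen" stays in memory after the discard, which is the situation that vacuuming is meant to clean up.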
Details about vacuuming
Repeated calls to this method leave obsolete document references in
the index, invisible to searches. Two mechanisms take care of cleaning up:
cleanup during search, and vacuuming.
Upon search, whenever a discarded ID is found (and ignored for the
results), references to the discarded document are removed from the
inverted index entries for the search terms. This ensures that subsequent
searches for the same terms do not need to skip these obsolete references
again.
In addition, vacuuming is performed automatically by default (see the
autoVacuum field in SearchOptions) after a certain number of documents
are discarded. Vacuuming traverses all terms in the index, cleaning up
all references to discarded documents. Vacuuming can also be triggered
manually by calling vacuum.
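The two cleanup mechanisms can be sketched with a simplified model. This is a hedged illustration under stated assumptions, not the library's code: the class VacuumingIndex, the vacuumThreshold parameter (a stand-in for the automatic vacuuming settings), and the helper staleRefCount are hypothetical, and the vacuum here runs synchronously for simplicity.

```typescript
// Simplified model for illustration; NOT the library's actual code.
// VacuumingIndex, vacuumThreshold and staleRefCount are hypothetical names.
class VacuumingIndex {
  private nextRef = 0;
  private liveRefs = new Map<number, string>();      // internal ref -> document ID
  private refById = new Map<string, number>();       // document ID -> internal ref
  private postings = new Map<string, Set<number>>(); // term -> internal refs
  private dirtCount = 0;                             // discards since the last vacuum
  private vacuumThreshold: number;

  constructor(vacuumThreshold: number = 3) {
    this.vacuumThreshold = vacuumThreshold;
  }

  add(id: string, text: string): void {
    if (this.refById.has(id)) throw new Error(`duplicate ID: ${id}`);
    const ref = this.nextRef++;
    this.liveRefs.set(ref, id);
    this.refById.set(id, ref);
    text.toLowerCase().split(/\s+/).filter(Boolean).forEach((term) => {
      if (!this.postings.has(term)) this.postings.set(term, new Set<number>());
      this.postings.get(term)!.add(ref);
    });
  }

  discard(id: string): void {
    const ref = this.refById.get(id);
    if (ref === undefined) throw new Error(`unknown ID: ${id}`);
    this.refById.delete(id);
    this.liveRefs.delete(ref);
    this.dirtCount++;
    // Automatic vacuuming after a certain number of discards.
    if (this.dirtCount >= this.vacuumThreshold) this.vacuum();
  }

  // Cleanup during search: obsolete refs found for this term are removed,
  // so later searches for the same term do not have to skip them again.
  search(term: string): string[] {
    const key = term.toLowerCase();
    const refs = this.postings.get(key);
    const hits: string[] = [];
    if (!refs) return hits;
    refs.forEach((ref) => {
      const id = this.liveRefs.get(ref);
      if (id === undefined) refs.delete(ref);
      else hits.push(id);
    });
    if (refs.size === 0) this.postings.delete(key);
    return hits;
  }

  // Vacuuming: traverse all terms and drop every obsolete ref.
  vacuum(): void {
    this.postings.forEach((refs, term) => {
      refs.forEach((ref) => {
        if (!this.liveRefs.has(ref)) refs.delete(ref);
      });
      if (refs.size === 0) this.postings.delete(term);
    });
    this.dirtCount = 0;
  }

  // Counts obsolete refs still stored in the index.
  staleRefCount(): number {
    let stale = 0;
    this.postings.forEach((refs) => {
      refs.forEach((ref) => {
        if (!this.liveRefs.has(ref)) stale++;
      });
    });
    return stale;
  }
}

const idx = new VacuumingIndex(3);
idx.add("1", "red apple");
idx.add("2", "red cherry");
idx.add("3", "green apple");
idx.add("4", "green pear");

idx.discard("1");                              // leaves obsolete refs under "red" and "apple"
const staleAfterDiscard = idx.staleRefCount();
const redHits = idx.search("red");             // also prunes the obsolete ref under "red"
const staleAfterSearch = idx.staleRefCount();  // only the ref under "apple" remains

idx.discard("2");
idx.discard("3");                              // third discard reaches the threshold
const staleAfterVacuum = idx.staleRefCount();
```

Searching only cleans the entries for the terms that were searched, which is why a full traversal is still needed to reclaim references under terms that are never queried.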