Showing 18 of 92 total issues
File normalizedstring.go has 568 lines of code (exceeds 500 allowed). Consider refactoring.
// Copyright (c) 2020, NLP Odyssey Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package normalizedstring
Type NormalizedString has 27 methods (exceeds 20 allowed). Consider refactoring.
type NormalizedString struct {
// The original version of the string, before any modification.
original string
// The normalized version of the string, after all modifications.
normalized string
Method NormalizedString.Split has a Cognitive Complexity of 34 (exceeds 20 allowed). Consider refactoring.
func (ns *NormalizedString) Split(
pattern splitpattern.SplitPattern,
behaviour SplitDelimiterBehavior,
) ([]*NormalizedString, error) {
captures, err := pattern.FindMatches(ns.normalized)
Cognitive Complexity
Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which estimates how difficult your code will be to test, Cognitive Complexity estimates how difficult your code will be to read and comprehend.
A method's cognitive complexity is based on a few simple rules:
- Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
- Code is considered more complex for each "break in the linear flow of the code"
- Code is considered more complex when "flow breaking structures are nested"
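To make the rules above concrete, here is a minimal sketch (not taken from the codebase) showing the same logic written twice: once with nested branches, which score extra complexity for each "flow-breaking structure" inside another, and once flattened with a single `switch`, which keeps the flow linear.

```go
package main

import "fmt"

// classifyNested uses nested ifs: each level of nesting adds to the
// cognitive complexity score, because branches must be read in context.
func classifyNested(n int) string {
	result := "none"
	if n != 0 {
		if n > 0 {
			if n%2 == 0 {
				result = "positive even"
			} else {
				result = "positive odd"
			}
		} else {
			result = "negative"
		}
	}
	return result
}

// classifyFlat expresses the same decision as one flat switch:
// every condition is read once, never inside another branch.
func classifyFlat(n int) string {
	switch {
	case n == 0:
		return "none"
	case n < 0:
		return "negative"
	case n%2 == 0:
		return "positive even"
	default:
		return "positive odd"
	}
}

func main() {
	for _, n := range []int{0, -3, 4, 7} {
		fmt.Println(classifyNested(n), "==", classifyFlat(n))
	}
}
```

The same guard-clause flattening is usually the first step when bringing a method like `Split` or `MergeAll` back under the threshold.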
Method NormalizedString.Split has 83 lines of code (exceeds 50 allowed). Consider refactoring.
func (ns *NormalizedString) Split(
pattern splitpattern.SplitPattern,
behaviour SplitDelimiterBehavior,
) ([]*NormalizedString, error) {
captures, err := pattern.FindMatches(ns.normalized)
Method Word.MergeAll has a Cognitive Complexity of 32 (exceeds 20 allowed). Consider refactoring.
func (w *Word) MergeAll(merges *MergeMap, dropout float64) {
symbolsLen := w.Len()
queue := make(WordMergeHeap, 0, symbolsLen)
skip := make([]WordMerge, 0, symbolsLen)
Method WordPieceModel.Tokenize has 57 lines of code (exceeds 50 allowed). Consider refactoring.
func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {
if len([]rune(sequence)) > m.maxInputCharsPerWord {
unkTokenID, unkTokenExists := m.vocab.GetID(m.unknownToken)
if !unkTokenExists {
return nil, ErrUnknownTokenOutOfVocabulary
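One way to shorten a method like this is to extract the over-length branch into a named helper. The sketch below is illustrative only: `Token`, the map-based vocabulary, and the helper name `unknownTokenAs` are simplified stand-ins for the real types, and only the refactoring shape is the point.

```go
package main

import (
	"errors"
	"fmt"
)

// Token is a simplified stand-in for the models.Token in the report.
type Token struct {
	ID    int
	Value string
}

var ErrUnknownTokenOutOfVocabulary = errors.New("unknown token is not in the vocabulary")

type model struct {
	vocab                map[string]int
	unknownToken         string
	maxInputCharsPerWord int
}

// unknownTokenAs returns the whole sequence as a single unknown token.
// Extracting this branch shortens Tokenize and isolates one early return.
func (m *model) unknownTokenAs(sequence string) ([]Token, error) {
	id, ok := m.vocab[m.unknownToken]
	if !ok {
		return nil, ErrUnknownTokenOutOfVocabulary
	}
	return []Token{{ID: id, Value: m.unknownToken}}, nil
}

func (m *model) Tokenize(sequence string) ([]Token, error) {
	if len([]rune(sequence)) > m.maxInputCharsPerWord {
		return m.unknownTokenAs(sequence)
	}
	// ... the greedy longest-match-first lookup would follow here ...
	return nil, nil
}

func main() {
	m := &model{vocab: map[string]int{"[UNK]": 0}, unknownToken: "[UNK]", maxInputCharsPerWord: 3}
	toks, err := m.Tokenize("abcdef")
	fmt.Println(err, len(toks), toks[0].Value)
}
```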
Method BPEModel.mergeWord has a Cognitive Complexity of 26 (exceeds 20 allowed). Consider refactoring.
func (m *BPEModel) mergeWord(w string) (*Word, error) {
word := NewWordWithCapacity(len(w))
var unkTokenID int
Method Word.MergeAll has 54 lines of code (exceeds 50 allowed). Consider refactoring.
func (w *Word) MergeAll(merges *MergeMap, dropout float64) {
symbolsLen := w.Len()
queue := make(WordMergeHeap, 0, symbolsLen)
skip := make([]WordMerge, 0, symbolsLen)
Method NormalizedString.OriginalAlignments has 54 lines of code (exceeds 50 allowed). Consider refactoring.
func (ns *NormalizedString) OriginalAlignments() []AlignmentRange {
// (start, end) are in alignments
// (offset, length) are in originalAlignments
originalAlignments := make([]AlignmentRange, 0, len(ns.original))
Method NormalizedString.TransformRange has 51 lines of code (exceeds 50 allowed). Consider refactoring.
func (ns *NormalizedString) TransformRange(
rng Range,
dest []RuneChange,
initialOffset int,
) {
Function New has 8 arguments (exceeds 4 allowed). Consider refactoring.
vocab *vocabulary.Vocabulary,
merges *MergeMap,
cacheCapacity int,
dropout float64,
unknownToken string,
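The usual fix for a long parameter list is to group the arguments into a single config struct, so call sites name each value. In this sketch, `BPEConfig` and the fields beyond those shown in the snippet above are assumptions for illustration, not the library's actual API.

```go
package main

import "fmt"

// BPEConfig is a hypothetical grouping of the constructor's parameters.
// CacheCapacity, Dropout, and UnknownToken mirror the report's snippet;
// the remaining fields are illustrative assumptions.
type BPEConfig struct {
	CacheCapacity           int
	Dropout                 float64
	UnknownToken            string
	ContinuingSubwordPrefix string
	EndOfWordSuffix         string
	FuseUnknownTokens       bool
}

type BPEModel struct {
	cfg BPEConfig
}

// NewFromConfig replaces an 8-argument constructor with one struct
// parameter; unset fields fall back to their zero values.
func NewFromConfig(cfg BPEConfig) *BPEModel {
	return &BPEModel{cfg: cfg}
}

func main() {
	m := NewFromConfig(BPEConfig{
		CacheCapacity: 1024,
		Dropout:       0.1,
		UnknownToken:  "<unk>",
	})
	fmt.Println(m.cfg.UnknownToken, m.cfg.CacheCapacity)
}
```

The same treatment applies to `NewEncoding` below; a struct also lets new options be added later without breaking every caller.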
Function NewEncoding has 8 arguments (exceeds 4 allowed). Consider refactoring.
ids []int,
typeIDs []int,
tokens []string,
words []int,
offsets []strutils.ByteOffsets,
Function MergeMapFromFile has 7 return statements (exceeds 4 allowed).
func MergeMapFromFile(
filename string,
vocab *vocabulary.Vocabulary,
prefixLength int,
) (m *MergeMap, err error) {
Consider simplifying this complex logical expression.
if (i >= '!' && i <= '~') || (i >= 0xA1 && i <= 0xAC) || (i >= 0xAE && i <= 0xFF) {
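One way to simplify the three chained range checks is to move them into a `unicode.RangeTable` behind a named predicate, which states the intent once and is easy to extend. The ranges below are copied from the expression; the name `isPrintableByte` is an assumption for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// printableByteRanges collapses the three hand-written comparisons
// into one sorted table: '!'..'~', 0xA1..0xAC, and 0xAE..0xFF.
var printableByteRanges = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: '!', Hi: '~', Stride: 1},
		{Lo: 0xA1, Hi: 0xAC, Stride: 1},
		{Lo: 0xAE, Hi: 0xFF, Stride: 1},
	},
	LatinOffset: 3, // all three ranges lie within Latin-1
}

// isPrintableByte names what the original boolean expression tests.
func isPrintableByte(r rune) bool {
	return unicode.Is(printableByteRanges, r)
}

func main() {
	// 0xAD (soft hyphen) is the one Latin-1 gap the ranges skip.
	fmt.Println(isPrintableByte('A'), isPrintableByte(0xAD), isPrintableByte(0xFF))
}
```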
Method BertPreTokenizer.PreTokenize has 6 return statements (exceeds 4 allowed).
func (b *BertPreTokenizer) PreTokenize(pts *pretokenizedstring.PreTokenizedString) error {
isWhitespacePattern := splitpattern.FromFunc(func(r rune) bool {
return unicode.In(r, unicode.White_Space)
})
isBertPunctuationPattern := splitpattern.FromFunc(func(r rune) bool {
Method NormalizedString.CoerceRangeToOriginal has 5 return statements (exceeds 4 allowed).
func (ns *NormalizedString) CoerceRangeToOriginal(r Range) (OriginalRange, bool) {
// If the range is already in the original referential, return it as-is
if or, isOriginal := r.(OriginalRange); isOriginal {
return or, true
}
Method WordPieceModel.Tokenize has 5 return statements (exceeds 4 allowed).
func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {
if len([]rune(sequence)) > m.maxInputCharsPerWord {
unkTokenID, unkTokenExists := m.vocab.GetID(m.unknownToken)
if !unkTokenExists {
return nil, ErrUnknownTokenOutOfVocabulary
Method WordPieceModel.Tokenize has a Cognitive Complexity of 21 (exceeds 20 allowed). Consider refactoring.
func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {
if len([]rune(sequence)) > m.maxInputCharsPerWord {
unkTokenID, unkTokenExists := m.vocab.GetID(m.unknownToken)
if !unkTokenExists {
return nil, ErrUnknownTokenOutOfVocabulary