nlpodyssey/gotokenizers

models/wordpiecemodel/wordpiecemodel.go

Summary

Maintainability: A (2 hrs)
Test Coverage

Method WordPieceModel.Tokenize has 57 lines of code (exceeds 50 allowed). Consider refactoring.
Open

func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {
    if len([]rune(sequence)) > m.maxInputCharsPerWord {
        unkTokenID, unkTokenExists := m.vocab.GetID(m.unknownToken)
        if !unkTokenExists {
            return nil, ErrUnknownTokenOutOfVocabulary
Severity: Minor
Found in models/wordpiecemodel/wordpiecemodel.go - About 1 hr to fix

Method WordPieceModel.Tokenize has 5 return statements (exceeds 4 allowed).
Open

func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {
    if len([]rune(sequence)) > m.maxInputCharsPerWord {
        unkTokenID, unkTokenExists := m.vocab.GetID(m.unknownToken)
        if !unkTokenExists {
            return nil, ErrUnknownTokenOutOfVocabulary
Severity: Major
Found in models/wordpiecemodel/wordpiecemodel.go - About 35 mins to fix
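
One way to address both the length finding and the return-count finding is to move the unknown-token fallback and the longest-match search into helpers. The sketch below is only an illustration under assumptions: `tokenizer`, `Token`, `unknown`, `longestMatch`, and the map-based vocabulary are hypothetical stand-ins rather than the repository's types, and the greedy longest-match loop is the commonly described WordPiece strategy, not necessarily what the real method does beyond the excerpt shown above.

```go
package main

import "errors"

// Token is a hypothetical stand-in for models.Token.
type Token struct {
	ID    int
	Value string
}

// errUnkOutOfVocab mirrors the error message from the report (name is hypothetical).
var errUnkOutOfVocab = errors.New("the provided unk token is out of vocabulary")

// tokenizer is a simplified, hypothetical stand-in for WordPieceModel.
type tokenizer struct {
	vocab        map[string]int
	unknownToken string
	prefix       string // continuation prefix, e.g. "##" (assumed)
	maxChars     int
}

// unknown builds the single-token fallback result in one place,
// so Tokenize does not repeat the vocabulary lookup and error return.
func (t *tokenizer) unknown() ([]Token, error) {
	id, ok := t.vocab[t.unknownToken]
	if !ok {
		return nil, errUnkOutOfVocab
	}
	return []Token{{ID: id, Value: t.unknownToken}}, nil
}

// longestMatch returns the longest vocabulary entry starting at runes[start],
// together with the index just past the matched piece.
func (t *tokenizer) longestMatch(runes []rune, start int) (Token, int, bool) {
	for end := len(runes); end > start; end-- {
		piece := string(runes[start:end])
		if start > 0 {
			piece = t.prefix + piece
		}
		if id, ok := t.vocab[piece]; ok {
			return Token{ID: id, Value: piece}, end, true
		}
	}
	return Token{}, start, false
}

// Tokenize splits a single word into sub-word tokens with three returns.
func (t *tokenizer) Tokenize(word string) ([]Token, error) {
	runes := []rune(word)
	if len(runes) > t.maxChars {
		return t.unknown()
	}
	var tokens []Token
	for start := 0; start < len(runes); {
		tok, end, ok := t.longestMatch(runes, start)
		if !ok {
			return t.unknown()
		}
		tokens = append(tokens, tok)
		start = end
	}
	return tokens, nil
}
```

With a vocabulary containing `play`, `##ing`, and `[UNK]`, this sketch splits `playing` into `play` and `##ing`, and falls back to `[UNK]` for a word with no matching piece.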

Method WordPieceModel.Tokenize has a Cognitive Complexity of 21 (exceeds 20 allowed). Consider refactoring.
Open

func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {
    if len([]rune(sequence)) > m.maxInputCharsPerWord {
        unkTokenID, unkTokenExists := m.vocab.GetID(m.unknownToken)
        if !unkTokenExists {
            return nil, ErrUnknownTokenOutOfVocabulary
Severity: Minor
Found in models/wordpiecemodel/wordpiecemodel.go - About 25 mins to fix

Cognitive Complexity

Cognitive Complexity is a measure of how difficult a unit of code is to intuitively understand. Unlike Cyclomatic Complexity, which determines how difficult your code will be to test, Cognitive Complexity tells you how difficult your code will be to read and comprehend.

A method's cognitive complexity is based on a few simple rules:

• Code is not considered more complex when it uses shorthand that the language provides for collapsing multiple statements into one
• Code is considered more complex for each "break in the linear flow of the code"
• Code is considered more complex when "flow breaking structures are nested"
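
The nesting rule can be illustrated with two functions that behave identically. This is a minimal sketch with hypothetical names (`firstKnownNested`, `firstKnownFlat`); it makes no claim about the exact score the analyzer would assign, only that the second shape has one less level of nested flow-breaking structure.

```go
package main

// firstKnownNested returns the first non-empty word present in vocab.
// The lookup sits two levels deep (for > if > if), and each extra
// level of nesting is what the rules above count against the method.
func firstKnownNested(words []string, vocab map[string]int) (string, bool) {
	for _, w := range words {
		if w != "" {
			if _, ok := vocab[w]; ok {
				return w, true
			}
		}
	}
	return "", false
}

// firstKnownFlat has the same behavior, but a guard clause skips empty
// words early so the lookup is nested only one level inside the loop.
func firstKnownFlat(words []string, vocab map[string]int) (string, bool) {
	for _, w := range words {
		if w == "" {
			continue
		}
		if _, ok := vocab[w]; ok {
			return w, true
		}
	}
	return "", false
}
```

Guard clauses and small extracted helpers are the usual ways to bring a method back under a complexity threshold without changing what it does.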


exported function NewDefault should have comment or be unexported
Open

func NewDefault() *WordPieceModel {

exported method WordPieceModel.Tokenize should have comment or be unexported
Open

func (m *WordPieceModel) Tokenize(sequence string) ([]models.Token, error) {

exported var ErrUnknownTokenOutOfVocabulary should have comment or be unexported
Open

var ErrUnknownTokenOutOfVocabulary = fmt.Errorf("the provided unk token is out of vocabulary")

exported function New should have comment or be unexported
Open

func New(
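
The four lint findings above are resolved by adding a doc comment that begins with the identifier's name. The sketch below shows that convention on a stripped-down stand-in. The struct fields, the `New` parameter list (truncated in the report), the default values, and the comment wording are all assumptions made for illustration, and `errors.New` is swapped in for `fmt.Errorf` because the message has no format verbs.

```go
package main

import "errors"

// ErrUnknownTokenOutOfVocabulary is returned when the configured unknown
// token has no entry in the vocabulary.
var ErrUnknownTokenOutOfVocabulary = errors.New("the provided unk token is out of vocabulary")

// WordPieceModel is a hypothetical, reduced stand-in for the real type;
// only the two fields needed for this example are included.
type WordPieceModel struct {
	unknownToken         string
	maxInputCharsPerWord int
}

// New returns a WordPieceModel with the given unknown token and maximum
// word length. The parameter list here is assumed, not the library's.
func New(unknownToken string, maxInputCharsPerWord int) *WordPieceModel {
	return &WordPieceModel{
		unknownToken:         unknownToken,
		maxInputCharsPerWord: maxInputCharsPerWord,
	}
}

// NewDefault returns a WordPieceModel with illustrative default settings.
// The values "[UNK]" and 100 are placeholders chosen for this sketch.
func NewDefault() *WordPieceModel {
	return New("[UNK]", 100)
}
```

The same pattern applies to the exported method: a comment starting with `Tokenize` placed directly above its declaration.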
