BERT text classification: ValueError: too many dimensions 'str'

Solution 1

I had the same problem. This worked for me; I think you need to do it at the beginning of your code, right after reading the CSV:
df['labels'] = df['labels'].replace(['negative','notr','positive'],[0,1,2])

Then split the data into training and test sets using these labels.

Original Author mojimoji Of This Content
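
As a minimal sketch of the same idea, assuming a pandas DataFrame with a labels column that holds the three strings used above: Series.map with an explicit dictionary performs the conversion, and any label missing from the dictionary becomes NaN, which makes typos in the data easy to catch. The sample frame and the name label2id are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV that was read.
df = pd.DataFrame({
    "text": ["great", "meh", "awful", "nice"],
    "labels": ["positive", "notr", "negative", "positive"],
})

# Explicit mapping: negative -> 0, notr (neutral) -> 1, positive -> 2.
label2id = {"negative": 0, "notr": 1, "positive": 2}

encoded = df["labels"].map(label2id)

# Labels missing from the mapping come back as NaN; fail early if any exist.
if encoded.isna().any():
    raise ValueError("found labels that are not in label2id")

df["labels"] = encoded.astype("int64")
print(df["labels"].tolist())
```

Doing the conversion before the train/test split means both splits share the same integer encoding.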

Solution 2

REASON

The issue is that you are passing a list of strings (str) to torch.tensor(), which only accepts numerical values (integers, floats, etc.).
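
The failure can be reproduced in a few lines. This is only a sketch, assuming PyTorch is installed; the label values are made up, and the exact wording of the message may vary between versions.

```python
import torch

string_labels = ["positive", "negative", "notr"]

# Strings cannot be stored in a tensor, so this call is expected to fail.
try:
    torch.tensor(string_labels)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Integer class ids are accepted.
int_labels = [2, 0, 1]
train_y = torch.tensor(int_labels)
print(train_y.dtype)
```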

SOLUTION

So I suggest converting your string labels into integer values before passing them to torch.tensor().

IMPLEMENTATION

The following code might help:

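# a temporary list to store the string labels
temp_list = train_labels.tolist()

# dictionary that maps each integer to its string label,
# with one entry per distinct class rather than one per row
label_dict = dict(enumerate(sorted(set(temp_list))))

# reverse lookup: string label -> integer
str_to_int = {label: idx for idx, label in label_dict.items()}

# list to store integer labels
int_labels = [str_to_int[label] for label in temp_list]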

Now pass int_labels to torch.tensor() and use the result as the labels:

train_y = torch.tensor(int_labels)

Whenever you want to see the string label for a given integer, look it up in the label_dict dictionary.

Original Author coderina Of This Content

Solution 3

Assuming you are using Hugging Face, you can use ClassLabel from the 🤗 datasets library:

from datasets import ClassLabel

c2l = ClassLabel(num_classes=2, names=['spam', 'ham'])

labels = ["spam", "ham", "ham"]

[c2l.str2int(label) for label in labels ]
# [0, 1, 1]

For more details, see:
https://discuss.huggingface.co/t/converting-string-label-to-int/2816

Original Author NpnSaddy Of This Content

Solution 4

Thanks, that did convert the labels to integers, but there is a problem with the classification:

0
0   positive
1   negative
2   positive
3   notr
4   positive
... ...
4002    notr
4003    positive
4004    positive
4005    notr
4006    negative

The frame held the data above. After the conversion to int:

0   0
1   1
2   2
3   3
4   4
... ...
4002    4002
4003    4003
4004    4004
4005    4005
4006    4006

Every row now has its own index as its label. What I need is for all negatives, neutrals and positives to be encoded as 0 for negative, 1 for neutral and 2 for positive.

Original Author NpnSaddy Of This Content
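
The difference between the two results can be sketched in plain Python. The sample list and the names below are illustrative assumptions, not the asker's actual code: numbering rows by position gives every row a distinct integer, while numbering the distinct classes gives one integer per class.

```python
labels = ["positive", "negative", "positive", "notr", "positive"]

# Numbering by row position: every row ends up as its own "class".
by_position = list(range(len(labels)))

# Numbering by class: one integer per distinct label.
label2id = {"negative": 0, "notr": 1, "positive": 2}
by_class = [label2id[label] for label in labels]

print(by_position)  # [0, 1, 2, 3, 4]
print(by_class)     # [2, 0, 2, 1, 2]
```

Only the second encoding is usable as a classification target, since the number of distinct values matches the number of classes.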
