393. UTF-8 Validation LeetCode Solution
In this guide, you will get 393. UTF-8 Validation LeetCode Solution with the best time and space complexity. The solution to UTF- Validation problem is provided in various programming languages like C++, Java, and Python. This will be helpful for you if you are preparing for placements, hackathons, interviews, or practice purposes. The solutions provided here are very easy to follow and include detailed explanations.
Table of Contents
- Problem Statement
- Complexity Analysis
- UTF- Validation solution in C++
- UTF- Validation solution in Java
- UTF- Validation solution in Python
- Additional Resources

Problem Statement of UTF- Validation
Given an integer array data representing the data, return whether it is a valid UTF-8 encoding (i.e. it translates to a sequence of valid UTF-8 encoded characters).
A character in UTF8 can be from 1 to 4 bytes long, subjected to the following rules:
For a 1-byte character, the first bit is a 0, followed by its Unicode code.
For an n-bytes character, the first n bits are all one’s, the n + 1 bit is 0, followed by n – 1 bytes with the most significant 2 bits being 10.
This is how the UTF-8 encoding would work:
Number of Bytes | UTF-8 Octet Sequence
| (binary)
——————–+—————————————–
1 | 0xxxxxxx
2 | 110xxxxx 10xxxxxx
3 | 1110xxxx 10xxxxxx 10xxxxxx
4 | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
x denotes a bit in the binary form of a byte that may be either 0 or 1.
Note: The input is an array of integers. Only the least significant 8 bits of each integer is used to store the data. This means each integer represents only 1 byte of data.
Example 1:
Input: data = [197,130,1]
Output: true
Explanation: data represents the octet sequence: 11000101 10000010 00000001.
It is a valid utf-8 encoding for a 2-bytes character followed by a 1-byte character.
Example 2:
Input: data = [235,140,4]
Output: false
Explanation: data represented the octet sequence: 11101011 10001100 00000100.
The first 3 bits are all one’s and the 4th bit is 0 means it is a 3-bytes character.
The next byte is a continuation byte which starts with 10 and that’s correct.
But the second continuation byte does not start with 10, so it is invalid.
Constraints:
1 <= data.length <= 2 * 104
0 <= data[i] <= 255
Complexity Analysis
- Time Complexity: O(n)
- Space Complexity: O(1)
393. UTF-8 Validation LeetCode Solution in C++
class Solution {
public:
bool validUtf8(vector<int>& data) {
int followedBytes = 0;
for (const int d : data)
if (followedBytes == 0) {
if ((d >> 3) == 0b11110)
followedBytes = 3;
else if ((d >> 4) == 0b1110)
followedBytes = 2;
else if ((d >> 5) == 0b110)
followedBytes = 1;
else if ((d >> 7) == 0b0)
followedBytes = 0;
else
return false;
} else {
if ((d >> 6) != 0b10)
return false;
--followedBytes;
}
return followedBytes == 0;
}
};
/* code provided by PROGIEZ */
393. UTF-8 Validation LeetCode Solution in Java
class Solution {
public boolean validUtf8(int[] data) {
int followedBytes = 0;
for (final int d : data)
if (followedBytes == 0) {
if ((d >> 3) == 0b11110)
followedBytes = 3;
else if ((d >> 4) == 0b1110)
followedBytes = 2;
else if ((d >> 5) == 0b110)
followedBytes = 1;
else if ((d >> 7) == 0b0)
followedBytes = 0;
else
return false;
} else {
if ((d >> 6) != 0b10)
return false;
--followedBytes;
}
return followedBytes == 0;
}
}
// code provided by PROGIEZ
393. UTF-8 Validation LeetCode Solution in Python
class Solution:
def validUtf8(self, data: list[int]) -> bool:
followedBytes = 0
for d in data:
if followedBytes == 0:
if (d >> 3) == 0b11110:
followedBytes = 3
elif (d >> 4) == 0b1110:
followedBytes = 2
elif (d >> 5) == 0b110:
followedBytes = 1
elif (d >> 7) == 0b0:
followedBytes = 0
else:
return False
else:
if (d >> 6) != 0b10:
return False
followedBytes -= 1
return followedBytes == 0
# code by PROGIEZ
Additional Resources
- Explore all LeetCode problem solutions at Progiez here
- Explore all problems on LeetCode website here
Happy Coding! Keep following PROGIEZ for more updates and solutions.