[Python] Strings: upper, lower, slicing, split, replace, find, len, strip

eyoo 2022. 4. 19. 11:26

A string is created by enclosing text in single quotes or double quotes.

in: x = 'Hello World'

in: x

out:

'Hello World'

A string in single or double quotes must be written on one line; it cannot contain a line break.

However, a string enclosed in triple quotes can span multiple lines.

in: y = '''hello
world'''

in: y

out:

'hello\nworld'

in: print(y)

out:

hello
world
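As a minimal sketch of the difference (the names `single_line` and `multi_line` are made up for this illustration), a one-line literal with an escaped line break produces the same string as a triple-quoted literal written across two lines:

```python
# A one-line literal needs the escape sequence \n to hold a line break.
single_line = 'hello\nworld'

# A triple-quoted literal may span several source lines;
# each line break in the source becomes a \n in the string.
multi_line = '''hello
world'''

print(single_line == multi_line)  # True
print(multi_line.splitlines())    # ['hello', 'world']
```

Any spaces typed between the triple quotes and the text become part of the string too, which is why the quotes sit directly against the words here.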

The '+' operator joins two or more strings.

in: first_name = 'Mitch'

in: last_name = 'Steve'

in: first_name + last_name

out:

'MitchSteve'

To put a space between the strings, add a string that contains a single space.

in: full_name = first_name + ' ' + last_name

in: full_name

out:

'Mitch Steve'
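For comparison, here is a small sketch that builds the same full name two ways. The `str.join` variant is not part of the original walkthrough; it is shown only as a common alternative when there are many pieces to combine.

```python
first_name = 'Mitch'
last_name = 'Steve'

# '+' joins the strings exactly as written, so the space is added by hand.
with_plus = first_name + ' ' + last_name

# str.join puts the separator between every item of a list.
with_join = ' '.join([first_name, last_name])

print(with_plus)               # Mitch Steve
print(with_plus == with_join)  # True
```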

Several string functions change the case of the letters.

To make every letter uppercase:

in: full_name.upper()

out:

'MITCH STEVE'

To make every letter lowercase:

in: full_name.lower()

out:

'mitch steve'

To capitalize the first letter of each word:

in: full_name.title()

out:

'Mitch Steve'
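The sketch below gathers the three case functions in one place and adds one typical use, comparing text without regard to case. The sample input `'MITCH steve'` is invented for the example.

```python
full_name = 'Mitch Steve'

# Each call returns a new string; full_name itself is not modified.
upper_name = full_name.upper()        # 'MITCH STEVE'
lower_name = full_name.lower()        # 'mitch steve'
title_name = 'mitch steve'.title()    # 'Mitch Steve'

# Lowercasing both sides makes the comparison case-insensitive.
same_person = 'MITCH steve'.lower() == full_name.lower()

print(upper_name, lower_name, title_name, sep=' | ')
print(same_person)   # True
print(full_name)     # Mitch Steve (unchanged)
```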

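The split function breaks a string into a list of substrings.

in: full_name.split()

out:

['Mitch', 'Steve']

# With no argument, split separates on whitespace. Passing a string inside the parentheses '()' splits on that string instead.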
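A short sketch of splitting on a chosen separator. The comma-separated line and the variable names are hypothetical; the last two lines show how `split()` and `split(' ')` differ when there are two spaces in a row.

```python
full_name = 'Mitch Steve'
words = full_name.split()           # split on whitespace

csv_line = 'apple,banana,cherry'    # hypothetical sample data
fruits = csv_line.split(',')        # split on commas

print(words)    # ['Mitch', 'Steve']
print(fruits)   # ['apple', 'banana', 'cherry']

# With no argument, runs of whitespace count as one separator.
# With an explicit ' ', every single space is a separator.
spaced = 'a  b'                     # two spaces between a and b
print(spaced.split())     # ['a', 'b']
print(spaced.split(' '))  # ['a', '', 'b']
```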

Square brackets '[ ]' extract a single character from a string.

in: letters = 'abcdefghijklmnopqrstuvwxyz'

in: letters[1]

out:

'b'

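# Put square brackets [ ] directly after the variable name
# Put a number inside the brackets
# That number is called the index, or offset
# Indexes start at 0 (Python numbers the characters automatically)

To get the last character of a string, put -1 inside the brackets.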

in: letters[-1]

out:

'z'
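To make the numbering concrete, this sketch reads a few offsets from a fresh copy of the alphabet and shows what happens when the offset is past the end. The variable names are chosen for the example only.

```python
letters = 'abcdefghijklmnopqrstuvwxyz'

first = letters[0]                      # offsets start at 0
second = letters[1]
last = letters[-1]                      # -1 counts from the end
same_last = letters[len(letters) - 1]   # the same character by positive offset

print(first, second, last, same_last)   # a b z z

# An offset equal to or larger than the length raises IndexError.
try:
    letters[26]
except IndexError as err:
    print('IndexError:', err)           # IndexError: string index out of range
```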

※ Strings are immutable, so a string cannot be changed once it has been created.

Instead, a new string that contains the change is created in memory.

To change a to k:

The wrong way:

in: letters[0] = 'k'

out:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_9784/1726073273.py in <module>
      1 # a string cannot be changed with =
----> 2 letters[0]='k'

TypeError: 'str' object does not support item assignment

This raises an error.

The replace function returns a new string with the characters changed.

in: letters.replace('a','k')

out:

'kbcdefghijklmnopqrstuvwxyz'

※ However, replace does not change the variable itself; to keep the result, assign it back to a variable.

in: letters = letters.replace('a','k')

in: print(letters)

out:

kbcdefghijklmnopqrstuvwxyz
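The sketch below starts from a fresh copy of the alphabet to show that replace leaves the original string alone. It also shows the optional third argument of `str.replace`, which the walkthrough above does not use; the word `'banana'` is just a sample with a repeated letter.

```python
letters = 'abcdefghijklmnopqrstuvwxyz'

# replace returns a new string; the original is left as it was.
changed = letters.replace('a', 'k')
print(letters[0], changed[0])   # a k

# Every occurrence is replaced unless a count is given as a third argument.
word = 'banana'                 # hypothetical sample word
all_replaced = word.replace('a', 'o')
first_only = word.replace('a', 'o', 1)
print(all_replaced)             # bonono
print(first_only)               # bonana
```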

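Extracting part of a string (slicing)

[:] from the start to the end
[start:] from offset start to the end
[:end] from the start to offset end-1
[start:end] from offset start to offset end-1
[start:end:step] from offset start to offset end-1, taking every step-th character

Getting c through g from letters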

in: letters[2:7]

out:

'cdefg'

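# Use offsets such as [2:7]; the characters themselves, as in [c:g], do not work

# The second offset is exclusive, so to include the character at a given offset, add 1 to it

A second colon sets the step, which picks characters at regular intervals. ( [::2] )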

in: letters[ : : 2]

out:

'kcegikmoqsuwy'
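Here is one sketch that exercises each slice form on a fresh, unmodified alphabet. The last part looks up the offsets of 'c' and 'g' with find instead of counting them by hand; that pairing is an illustration, not something the steps above require.

```python
letters = 'abcdefghijklmnopqrstuvwxyz'

whole = letters[:]         # start to end
tail = letters[20:]        # offset 20 to the end
head = letters[:3]         # start to offset 2
middle = letters[2:7]      # offset 2 to offset 6
every_7th = letters[::7]   # offsets 0, 7, 14, 21

print(tail)        # uvwxyz
print(head)        # abc
print(middle)      # cdefg
print(every_7th)   # ahov

# The end offset is exclusive, so add 1 to include 'g'.
start = letters.find('c')
end = letters.find('g') + 1
same_middle = letters[start:end]
print(same_middle)  # cdefg
```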

The len function returns the number of characters in a string.

in: len(letters)

out:

26

Whitespace counts: two strings that look the same to a person are different strings to the computer if one of them contains extra spaces.

in: email = 'abc@gmail.com'

in: email2 = ' abc@gmail.com '

in: print(len(email))
in: print(len(email2))

out:

13
15

The strip function removes whitespace from both ends of a string.

in: email2 = email2.strip()

in: print(email2)

out:

abc@gmail.com

# Passing characters inside the parentheses '()' removes those characters from both ends instead of whitespace.
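A minimal sketch of the strip family, assuming an address with one space on each side. `lstrip` and `rstrip` are not covered above; they are included because they are the one-sided versions of the same function. The starred string is made up for the example.

```python
email2 = ' abc@gmail.com '      # one space on each side

stripped = email2.strip()       # both ends
left_only = email2.lstrip()     # leading whitespace only
right_only = email2.rstrip()    # trailing whitespace only
print(len(email2), len(stripped), len(left_only), len(right_only))  # 15 13 14 14

# Whitespace inside the string is not touched.
inner = ' a b '.strip()
print(repr(inner))              # 'a b'

# With an argument, strip removes those characters from both ends instead.
starred = '***hello***'.strip('*')
print(starred)                  # hello
```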

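Finding a position in a string

The find function returns the offset at which the first occurrence of the search string begins.
The rfind function returns the offset at which the last occurrence begins.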

in: poem = '''So "it is" quite different, then, if in a mountain town
the mountains are close, rather than far. Here

they are far, their distance away established,
consistent year to year, like a parent’s

or sibling’s. They have their own music.
So I confess I do not know what it’s like,

listening to mountains up close, like a lover,
the silence of them known, not guessed at.'''

Searching for 'year' from the front

in: poem.find('year')

out:

Searching for 'year' from the back

in: poem.rfind('year')

out:

If the search string is not found, the result is -1.

in: poem.find('banana')

out:

-1
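The offsets are easier to verify on a short string, so this sketch uses a single line taken from the poem rather than the whole text. It also shows why the -1 result should be checked before use: -1 is itself a valid index (the last character), so slicing with it would not fail.

```python
line = 'consistent year to year, like a parent'

first = line.find('year')       # offset where the first 'year' begins
last = line.rfind('year')       # offset where the last 'year' begins
missing = line.find('banana')   # not present

print(first, last, missing)     # 11 19 -1

# Check for -1 before using the result as an offset.
if missing == -1:
    print('not found')
else:
    print(line[missing:])

# A found offset can be fed straight into a slice.
print(line[first:first + 4])    # year
```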

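The count function returns how many times a substring occurs in a string.

in: poem.count('year')

out:

2

If the substring does not occur in the string, the result is 0.

in: poem.count('banana')

out:

0

To check whether a substring is present, rather than how many times it occurs: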

in: 'year' in poem

out:

True
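As a compact sketch, the same checks on a single line from the poem, plus one detail worth knowing: count only counts occurrences that do not overlap. The string `'aaaa'` is a made-up example for that last point.

```python
line = 'consistent year to year, like a parent'

year_count = line.count('year')
banana_count = line.count('banana')
has_year = 'year' in line
no_banana = 'banana' not in line

print(year_count, banana_count)   # 2 0
print(has_year, no_banana)        # True True

# Occurrences are counted without overlap: 'aa' fits twice in 'aaaa', not three times.
overlap_count = 'aaaa'.count('aa')
print(overlap_count)              # 2
```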

# In addition, the startswith and endswith functions check whether a string begins or ends with a given substring.
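A brief sketch of both checks, using a hypothetical file name. Both functions are case-sensitive, and both also accept a tuple of candidates.

```python
filename = 'report_2022.csv'    # hypothetical file name

is_report = filename.startswith('report')
is_csv = filename.endswith('.csv')
print(is_report, is_csv)        # True True

# A tuple checks several candidates at once.
is_table = filename.endswith(('.csv', '.tsv'))
print(is_table)                 # True

# The comparison is case-sensitive.
wrong_case = filename.startswith('Report')
print(wrong_case)               # False
```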