  • divide data and proper output

     Tundra.L updated 7 months, 1 week ago 2 Members · 5 Posts
  • Justin

    Administrator
    November 4, 2020 at 2:01 pm

    Why not use the pandas qcut() function to create a flag variable? It works well for this project.

    • Tundra.L

      Member
      November 4, 2020 at 2:59 pm

      Thank you.

      This is the qcut() call I used:

      cr_label = range(1, 11)
      Crdt_seg = pd.qcut(Loan['CR_Score'], q=10, labels=cr_label)
      Loan = Loan.assign(credit_seg=Crdt_seg.values)
      Loan.groupby('credit_seg').agg({'Client_ID': 'count'})

      The number of clients in each segment is slightly different from yours. Is there anything wrong with my code?

      • This reply was modified 7 months, 1 week ago by  Tundra.L.
      • Justin

        Administrator
        November 4, 2020 at 11:14 pm

        No, it's OK. SAS PROC RANK and the qcut() function may create the bins slightly differently. It does NOT matter; the bins don't need to match exactly.

        Also, please remember to exclude missing, zero, or negative scores from the analysis.
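For reference, the filtering advice above could be combined with qcut() roughly as follows. This is only a sketch: the helper name add_credit_segment is made up for illustration, and it assumes the Loan data frame has a numeric CR_Score column as in the code posted earlier. Note that qcut() cuts at quantile values, so tied scores land in the same bin and the segment sizes may not be exactly equal.

```python
import pandas as pd


def add_credit_segment(loan: pd.DataFrame, score_col: str = "CR_Score") -> pd.DataFrame:
    """Hypothetical helper: drop invalid scores, then label deciles 1..10."""
    # Keep only rows with a score that is present and strictly positive.
    valid = loan[loan[score_col].notna() & (loan[score_col] > 0)].copy()
    # Split the remaining scores into 10 quantile-based bins labelled 1..10.
    valid["credit_seg"] = pd.qcut(valid[score_col], q=10, labels=list(range(1, 11)))
    return valid


# Example usage (assumes a DataFrame named Loan with Client_ID and CR_Score):
# Loan = add_credit_segment(Loan)
# Loan.groupby("credit_seg", observed=False).agg({"Client_ID": "count"})
```

Filtering before calling qcut() matters because zero or negative scores would otherwise shift the lower bin edges.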

        • Tundra.L

          Member
          November 6, 2020 at 2:17 pm

          Got it. Thank you.

  • Tundra.L

    Member
    November 4, 2020 at 1:24 pm

    To divide the credit score into 10 sections, I used this code:

    probs = np.arange(0, 1.1, 0.1)
    for p in probs:
        q = Loan['CR_Score'].quantile(p)
        print(q)

    The output is a series of values:

    101.0
    331.0
    389.0
    418.0
    440.0
    462.0
    484.0
    507.0
    533.0
    576.0
    846.0

    But I want the output formatted like this:

    q0=101

    q1=331, …

    q10=846

    How can I do that?
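One possible way to get that format, as a minimal sketch: number the quantiles with enumerate() and build each line with an f-string. The helper name format_deciles is hypothetical, and rounding the cutoffs to whole numbers is an assumption based on the desired output shown above.

```python
import numpy as np
import pandas as pd


def format_deciles(scores: pd.Series) -> list[str]:
    """Hypothetical helper: return lines like 'q0=101', 'q1=331', ..., 'q10=846'."""
    # 11 evenly spaced probabilities: 0.0, 0.1, ..., 1.0
    probs = np.linspace(0, 1, 11)
    # Passing an array of probabilities returns all cutoffs in one call.
    cutoffs = scores.quantile(probs)
    # Label each cutoff with its position and round to a whole number.
    return [f"q{i}={value:.0f}" for i, value in enumerate(cutoffs)]


# Example usage (assumes a DataFrame named Loan with a CR_Score column):
# print("\n".join(format_deciles(Loan["CR_Score"])))
```

Passing all the probabilities to quantile() at once avoids the explicit loop, and the index from enumerate() supplies the q0 to q10 labels.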
